1 Introduction



In this paper, we present new results on two problems: Stochastic Boolean 
Function Evaluation and Stochastic Submodular Set Cover (SSSC). We also 
consider the problem of ranking linear functions. 

1.1 Overview 

Stochastic Boolean Function Evaluation is the problem of determining the value
of a given Boolean function f on an unknown input x, when each bit x_i of
x can only be determined by paying an associated cost c_i. The assumption is
that x is drawn from a given product distribution, and the goal is to minimize 
the expected cost. This problem has been studied in the Operations Research 
literature, where it is known as "sequential testing" of Boolean functions. It 
has been studied in learning theory, in the context of learning with attribute 
costs. There have been a number of papers on exact algorithms for Stochastic 
Boolean Function Evaluation, but there is little work on developing approxima- 
tion algorithms ^MW5\ . 

Our main result in Stochastic Boolean Function Evaluation is a 3-approximation 
algorithm for evaluating Boolean linear threshold formulas. We also present an 
approximation algorithm for evaluating CDNF formulas (and decision trees) 
achieving a factor of O(log kd), where k is the number of clauses in the CNF
formula, and d is the number of terms in the DNF formula.

In addition, we present approximation algorithms for simultaneous evalua- 
tion of linear threshold functions, and for ranking of linear functions. These 
problems are motivated by applications in databases. 

Our approximation algorithms are based on reductions to the Stochastic 
Submodular Set Cover (SSSC) problem. The SSSC problem was introduced by
Golovin and Krause [H]. This problem is a generalization of the Submodular
Set Cover problem, which is in turn a generalization of the classical NP-complete
Set Cover problem.

Golovin and Krause presented an approximation algorithm for the SSSC 
problem, which they called Adaptive Greedy. It is a generalization of the stan- 
dard greedy algorithm for the classical Set Cover problem. 

Our main result on the SSSC problem is a new approximation algorithm, 
which we call Adaptive Dual Greedy. It is an extension of the Dual Greedy 
algorithm for Submodular Set Cover due to Fujito, which is a generalization of 
Hochbaum's algorithm for the classical Set Cover Problem [15] [16]. We also give
a new bound on the approximation achieved by the Adaptive Greedy algorithm 
of Golovin and Krause. 

We note that while we apply our new SSSC results solely to Boolean func- 
tion evaluation problems, the results may also be applicable to other stochastic 
problems with product distribution (or more narrowly, average case or uniform 
distribution) assumptions. 

1 They call it Stochastic Submodular Coverage. Our choice of name is for consistency with
the terminology used in Fujito [16].

1.2 Detailed Summary of Results



The Q-value approach: We begin by showing how to exploit the existing 
Adaptive Greedy algorithm of Golovin and Krause, for the SSSC problem, to 
obtain approximation algorithms for Stochastic Boolean Function Evaluation. 

We use the following basic approach, which we call the Q-value approach. 
First we reduce the evaluation problem to a binary SSSC problem, through the 
construction of what we call an assignment feasible utility function, with a goal 
value Q. Then we apply Adaptive Greedy to the reduced problem, and bound 
the quality of the approximation using their (ln Q + 1) approximation bound on
that algorithm. 

The Q-value approach easily yields our algorithm for evaluating CDNF for- 
mulas (or decision trees). As stated above, it achieves a solution that is within 
a factor of O(log kd) of the optimal evaluation strategy, where k is the number
of clauses of the input CNF and d is the number of terms of the input DNF.
Previously, Kaplan et al. gave an approximation algorithm for this problem that
also achieved an O(log kd) approximation factor, but only for the special case
of monotone formulas, unit costs, and the uniform distribution [29].

We also show how to use the Q-value approach to develop an algorithm for 
evaluating linear threshold formulas with integer coefficients. Its approximation 
factor is O(log D), where D is the sum of the magnitudes of the coefficients of
the linear threshold formula. This is a weak result, because we later show that
a different approach yields a constant approximation factor of 3 for this problem.
Nevertheless, we adapt the algorithm later in the paper to obtain other results. 

We show that the Q-value approach has inherent limitations. We give neg-
ative results showing that it will not give a sublinear approximation factor for
evaluating read-once DNF (despite the fact that there is a polynomial-time exact 
algorithm for this problem [29j[2T]), or for evaluating linear threshold formulas 
with exponentially-large coefficients. In fact, we show that the weak O(log D)
approximation factor that we obtained for linear threshold formulas cannot be 
improved to be sublinear in n. To prove the negative results, we introduce a 
new combinatorial measure of a Boolean function, which we call its Q-value. 

Adaptive Dual Greedy: We present a new approximation algorithm for solv- 
ing the SSSC problem, Adaptive Dual Greedy. 

We prove an associated approximation bound of α, where α is a ratio that
depends on the cover constructed by the algorithm. This is the main technical 
contribution of this paper. 

3-Approximation for Linear Threshold Formulas: Our O(log D) approx-
imation algorithm for evaluating linear threshold formulas, using the Q-value
approach, worked by reducing the evaluation problem to an appropriate SSSC
problem and applying the Adaptive Greedy algorithm. We substitute Adaptive
Dual Greedy for Adaptive Greedy in that algorithm. We show that in this case,
α is bounded above by 3, and hence we get a 3-approximation algorithm.

2 Kaplan et al. actually showed that their algorithm achieved a solution that was within
O(log kd) of the expected certificate cost, which is a lower bound on the expected cost of the
optimal evaluation strategy. The gap between expected certificate cost and expected cost
of the optimal strategy can be large: for disjunction evaluation, with unit costs, under the
distribution where each p_i = 1/(i + 1), the first measure is constant, while the second is
Ω(log n). Thus our result is stronger than that of Kaplan et al. in handling a significantly
more general problem, but weaker in providing an approximation bound with respect to the
optimal evaluation strategy, rather than with respect to the certificate size.

New bound on Adaptive Greedy: We prove a new bound on the Adaptive
Greedy algorithm of Golovin and Krause, showing that it achieves an approxi-
mation factor of 2(ln P + 1) in the binary case (and k(ln P + 1) in the k-ary case),
where P is the maximum amount of utility that can be contributed by a single
item. Except for the additional factor of 2, this bound generalizes a previous
approximation bound of Wolsey for (non-adaptive) submodular set cover [40],
which in turn generalized the (ln s + 1) approximation bound for the classical
set cover problem, where s is the maximum size of one of the input subsets
(cf. [H]).

Simultaneous Evaluation of Linear Threshold Formulas: We exploit 
some of the above techniques to obtain new approximation algorithms for si- 
multaneous evaluation of multiple threshold formulas. 

For the problem of simultaneous evaluation of m linear threshold formulas
defined on x_1, ..., x_n, we give two related approximation algorithms, with ap-
proximation factors of O(log m D_avg) and D_max respectively. Here D_avg and
D_max are the average and maximum, over the m threshold formulas, of the
sum of the magnitudes of the coefficients of the variables in that formula. These
results generalize previous results of Liu et al. for the simultaneous evaluation
of m Boolean OR formulas (the shared filter ordering problem) [31]. For OR
formulas, our D_max bound improves a bound of Liu et al. by a factor of 2, using
a deterministic rather than a randomized algorithm.

Ranking of Linear Functions: We end by giving an approximation algorithm
for the problem of ranking a set of m linear functions a_1 x_1 + ... + a_n x_n (not
linear threshold functions), defined over {0,1}^n, according to the value of their
outputs. This ranking problem naturally appears in Web search and in database
query processing, in the presence of uncertainty in the input. For example, the
linear functions may capture the "scores" assigned to documents or database 
tuples over a set of unknown input properties like user preferences, reputation 
of the underlying data source, etc., and our goal is to order the documents or 
tuples at minimal cost. The problem also arises naturally in the context of 
query processing over probabilistic databases, where the database tuples are 
annotated with probabilities and there is a cost to resolve the uncertainty in a 
tuple [28] . 

Our ranking algorithm achieves a factor of O(log(m D_max)) approximation,
where D_max is the maximum, over all the linear functions, of the sum of the
magnitudes of the coefficients in that function. 

1.3 Organization of Paper 

In Section 2 we present an overview of the Stochastic Boolean Function Eval-
uation problem, followed by a review of the related literature in Section 3. In
Section 4 we provide notation and definitions, background on the Stochastic
Submodular Set Cover problem, and the greedy approximation algorithm for
the SSSC problem. We apply the algorithm to Boolean Function Evaluation
problems by reducing them to instances of SSSC in Section 5, and provide ap-
proximation algorithms for CDNF and linear threshold function evaluation. We
discuss the limitations of this approach in Section 5.4 and provide a new algo-
rithm for the SSSC problem in Section 6. We apply the new algorithm to linear
threshold function evaluation to obtain a 3-approximation algorithm in Section 7.
In Section 8 we provide a new bound for the standard greedy algorithm for
SSSC. Finally, in Section 9 we provide algorithms for simultaneous evaluation
and ranking of linear functions. A table of the notation used in this paper is
provided in the appendix.

2 Stochastic Boolean Function Evaluation 

The formal definition of the Stochastic Boolean Function Evaluation problem
is as follows. The input is a representation of a Boolean function f(x_1, ..., x_n)
from a fixed class of representations C, a probability vector p = (p_1, ..., p_n),
where 0 < p_i < 1, and a real-valued cost vector (c_1, ..., c_n), where c_i ≥ 0.
An algorithm for this problem must compute and output the value of f on an
x ∈ {0,1}^n, drawn randomly from product distribution D_p, such that p_i =
Prob[x_i = 1]. However, it is not given direct access to x. Instead, it can
discover the value of any x_i by "testing" it, at a cost of c_i. The algorithm must
perform the tests sequentially, each time choosing the next test to perform.
The algorithm can be adaptive, so the choice of the next test can depend on the
outcomes of the previous tests. The expected cost of the algorithm is the expected
cost it incurs on a random x from D_p. (Note that since each p_i is strictly between 0 and
1, the algorithm must continue doing tests until it has obtained a 0-certificate
or 1-certificate for the function.) The algorithm is optimal if it has minimum
possible expected cost with respect to D_p.

We consider the running time of the algorithm to be the (worst-case) time it
takes to determine the single next variable to be tested, or to compute the value
of f(x) after the last test result is received. The algorithm corresponds to a
Boolean decision tree (strategy) computing f, indicating the adaptive sequence
of tests. 

Stochastic function evaluation problems arise in many different application
areas. For example, in medical diagnosis, the x_i might correspond to medical
tests performed on a given patient, where f(x) = 1 if the patient should be
diagnosed as having a particular disease. In a factory setting, the x_i might be
the results of quality-control tests performed on a manufactured item, where
f(x) = 1 if the item should be sold. In query optimization in databases, f could
correspond to a Boolean query on predicates corresponding to x_1, ..., x_n, that
has to be evaluated for every tuple in the database to find tuples that satisfy 
the query [26l H3 Q3 [38] . 

There is a simple approximation algorithm for solving the Stochastic Boolean
Function Evaluation problem that works for all functions, and achieves an ap-
proximation factor of n, even under arbitrary distributions: Simply test the
variables in increasing order of their costs (cf. [H]). We therefore consider a 
factor of n approximation to be trivial; in designing our approximation algo- 
rithms for Boolean function evaluation, we want to achieve an approximation 
factor that is less than n, and preferably significantly less. 

If Boolean function f is given by its truth table, the Stochastic Boolean
Function Evaluation problem can be exactly solved in time poly-
nomial in the size of the input truth table (i.e., in time 2^{O(n)}), using dynamic
programming, using the approach in fFH 155] . 

There is a well-known polynomial-time algorithm that exactly solves the
problem of evaluating a Boolean disjunction in our framework: simply test the
variables in increasing order of the ratio c_i/p_i (see, e.g., [17]). A symmetric
algorithm works for conjunctions. There is also a polynomial-time exact al-
gorithm for the more general problem of evaluating a k-of-n function (i.e., a
function that evaluates to 1 iff at least k of the input variables x_i are equal to
1).
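
As a minimal sketch of the ratio rule for disjunctions, the following Python code sorts the variables by c_i/p_i and computes the expected cost of testing in a given order until the first 1 is found. The function names `or_test_order` and `expected_or_cost` and the numeric instance are illustrative assumptions, not taken from the paper; the brute-force comparison over all orders is included only to check the rule on this small instance.

```python
from itertools import permutations

def or_test_order(costs, probs):
    """Indices sorted by increasing c_i / p_i (the ratio rule for a disjunction)."""
    return sorted(range(len(costs)), key=lambda i: costs[i] / probs[i])

def expected_or_cost(order, costs, probs):
    """Expected cost of testing in `order`, stopping at the first x_i = 1."""
    total = 0.0
    prob_all_zero = 1.0  # probability that every earlier test came out 0
    for i in order:
        total += prob_all_zero * costs[i]  # test i is paid for only if it is reached
        prob_all_zero *= 1.0 - probs[i]
    return total

# Hypothetical instance with three variables.
costs = [4.0, 1.0, 3.0]
probs = [0.8, 0.1, 0.5]

order = or_test_order(costs, probs)
ratio_rule_cost = expected_or_cost(order, costs, probs)

# Exhaustive check over all 3! orders (feasible only for tiny n).
best_cost = min(expected_or_cost(p, costs, probs) for p in permutations(range(3)))
```

On this instance the ratio rule tests the most expensive variable first, because its high probability of being 1 makes it the most likely to end the evaluation early.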

The evaluation problem is also exactly solvable in polynomial time when f
is a read-once DNF formula, but the complexity of the problem is open when f
is a general read-once formula [7] HU HO] ■ 

The evaluation problem is NP-hard for linear threshold functions [TTj . How- 
ever, for the special case of unit costs and the uniform distribution, testing the 
variables in decreasing order of the magnitude of their coefficients is an optimal 
strategy [61114]. 

A survey by Unluyurt [39] covers results on exactly solving the Stochastic 
Boolean Function Evaluation problem, but does not consider approximation 
algorithms. 

3 Other Related Work 

There is a sample version of the Boolean function evaluation problem, where the
input is a sample of a given Boolean function f (i.e. a set of pairs (x, f(x))),
and the problem is to build a decision tree consistent with the sample that mini-
mizes the average cost of evaluation on assignments in the sample. Golovin et al.
and Bellala and Scott independently developed polynomial-time approximation
algorithms for this problem, for arbitrary f, which achieve an O(log m) approx-
imation factor, where m is the number of assignments in the sample [121 [3J.
Moshkov and Chikalov proved bounds, also for arbitrary f, relating the average
cost of an optimal strategy to the size of a certain combinatorial measure of the 
sample [33] . 

Moshkov considered a variant of this problem where the goal is to minimize 
the worst-case cost of evaluation, rather than the average cost. He gave a 
greedy algorithm that also achieves an O(log m) approximation factor [35]. The
problem is NP-hard even when f is a conjunction; there is a polynomial-time
4-approximation algorithm for that case [2 [T3l El] .

Kaplan et al. examined the more general problem of minimizing the expected
cost of evaluating a Boolean function f with respect to a given arbitrary prob-
ability distribution, where the distribution is given by a conditional probability
oracle [221. In the work of Kaplan et al., the goal of evaluation differs slightly
from ours in that they require the evaluation strategy to output an "explana- 
tion" of the function value upon termination. They give as an example the 
case of evaluating a DNF that is identically true; they require testing of the 
variables in one term of the DNF in order to output that term as a certificate. 
In contrast, under our definitions, the optimal strategy for evaluating an identi- 
cally true DNF formula is one that simply outputs "true" and performs no tests 
(which is a zero-cost strategy). 

There are also a number of papers on building identification trees of mini-
mum average cost, given S ⊆ {0,1}^n, but that problem is fundamentally dif-
ferent from function evaluation because each x ∈ S must have its own leaf (cf.

my 

Charikar et al. [10] considered a different version of our function evaluation 
problem, which is not stochastic. Their goal is to minimize the worst-case ratio 
between the cost incurred in evaluating f on an input x, and the minimum cost
of a certificate contained in x. 

As we discuss in Section 7, the linear threshold evaluation problem is related
to the Min-Knapsack problem. There has been previous work on approximating 
the "stochastic knapsack" problem, but that problem is a version of the standard 
(max) knapsack problem, and does not appear to be relevant to our work. Han 
and Makino considered an on-line version of the Min-Knapsack problem where the items
are given one-by-one over time [24].

There are a number of other non-adaptive versions of standard and submod- 
ular set cover that have been previously studied. In some, the cost of including 
an item j in the cover is not a constant. For example, Iwata and Nagano [27] 
studied the "submodular cost set cover" problem, where the cost of the cover 
is a submodular function that depends on the subsets in the cover. Beraldi 
and Ruszczynski addressed a probabilistic set cover problem where the set of 
elements covered by each input subset is a random variable, and full coverage 
must be achieved with a certain given probability [5]. 

4 Preliminaries 

In this section we provide notation and definitions used in the paper, as well as 
necessary background on the SSSC problem. 

4.1 Basic notation and definitions 

A partial assignment is a vector b ∈ {0,1,*}^n. We view a partial assignment
as an assignment to the variables x_1, ..., x_n. For partial assignment b, we use
dom(b) to denote the set {x_i | b_i ≠ *}. We will frequently use a partial assignment
b to represent the outcomes of binary tests, where for l ∈ {0,1}, b_i = l indicates
that test i was performed and had outcome l, and b_i = * indicates that test i
was not performed.

For partial assignments a, b ∈ {0,1,*}^n, a is an extension of b, written a ~ b,
if a_i = b_i for all b_i ≠ *. We also say that b is contained in a.

Given Boolean function f : {0,1}^n → {0,1}, a partial assignment b ∈
{0,1,*}^n is a 0-certificate (1-certificate) of f if f(a) = 0 (f(a) = 1) for all
a such that a ~ b. Given a cost vector c = (c_1, ..., c_n), the cost of a certificate
b is Σ_{j : b_j ≠ *} c_j.

Let N = {1, ..., n}.

In what follows, we assume that utility functions are integer-valued. In
the context of standard work on submodularity, a utility function is a function
g : 2^N → Z_{≥0}. Given S ⊆ N and j ∈ N, g_S(j) denotes the quantity
g(S ∪ {j}) - g(S).

We will also use the term utility function to refer to a function g : {0,1,*}^n →
Z_{≥0} defined on partial assignments. Let g : {0,1,*}^n → Z_{≥0} be such a utility
function, and let b ∈ {0,1,*}^n. For S ⊆ N, let b^S ∈ {0,1,*}^n where b^S_i = b_i for
i ∈ S, and b^S_i = * otherwise. We define g(S, b) = g(b^S). For j ∈ N, we define
g_{S,b}(j) = g(S ∪ {j}, b) - g(S, b).

For l ∈ {0,1,*}, the quantity b_{x_i←l} denotes the partial assignment that is
identical to b except that b_i = l. We define g_b(i, l) = g(b_{x_i←l}) - g(b) if b_i = *,
and g_b(i, l) = 0 otherwise. When b represents test outcomes, and test i has
not been performed yet, g_b(i, l) is the change in utility that would result from
adding test i with outcome l.

Given probability vector p = (p_1, ..., p_n), we use x ~ D_p to denote a random
x drawn from product distribution D_p. For fixed D_p, b ∈ {0,1,*}^n, and i ∈ N,
we use E[g_b(i)] to denote the expected increase in utility that would be obtained
by testing i. In the binary case, E[g_b(i)] = p_i g_b(i, 1) + (1 - p_i) g_b(i, 0). Note that
E[g_b(i)] = 0 if b_i ≠ *.
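
The quantities g_b(i, l) and E[g_b(i)] can be sketched in a few lines of Python. This is an illustrative sketch only: partial assignments are represented as tuples over {0, 1, "*"}, and the helper names `extend`, `gain`, `expected_gain`, and the toy utility `ones_seen` are assumptions introduced here, not notation from the paper.

```python
STAR = "*"  # marks a test that has not been performed

def extend(b, i, value):
    """Return the partial assignment identical to b except that position i is `value`."""
    return b[:i] + (value,) + b[i + 1:]

def gain(g, b, i, value):
    """g_b(i, l): change in utility from adding test i with outcome `value`.
    Defined to be 0 when test i has already been performed."""
    if b[i] != STAR:
        return 0
    return g(extend(b, i, value)) - g(b)

def expected_gain(g, b, i, probs):
    """E[g_b(i)] = p_i * g_b(i, 1) + (1 - p_i) * g_b(i, 0) in the binary case."""
    return probs[i] * gain(g, b, i, 1) + (1 - probs[i]) * gain(g, b, i, 0)

def ones_seen(b):
    """Toy utility (hypothetical): the number of tests observed to have outcome 1."""
    return sum(1 for v in b if v == 1)

b = (STAR, STAR, 1)          # only test 2 has been performed, with outcome 1
probs = [0.25, 0.5, 0.5]
untested_gain = expected_gain(ones_seen, b, 0, probs)
tested_gain = expected_gain(ones_seen, b, 2, probs)
```

Here `untested_gain` equals p_0 because only outcome 1 increases the toy utility, and `tested_gain` is 0 because test 2 has already been performed.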

Utility function g : {0,1,*}^n → Z_{≥0} is monotone if for b ∈ {0,1,*}^n, i ∈ N
such that b_i = *, and l ∈ {0,1}, g(b_{x_i←l}) - g(b) ≥ 0; in other words, additional
information can only increase utility. Utility function g is submodular if for all
b, b' ∈ {0,1,*}^n and l ∈ {0,1}, g(b_{x_i←l}) - g(b) ≥ g(b'_{x_i←l}) - g(b') whenever
b' ~ b and b_i = b'_i = *. In the testing context, if the n test outcomes are
predetermined, submodularity means that the value of a given test (measured
by the increase in utility) will not increase if we delay that test until later.

4.2 The Stochastic Submodular Set Cover (SSSC) problem

The SSSC problem is similar to the Stochastic Boolean Function Evaluation
problem, except that the goal is to achieve a cover, rather than to evaluate a
Boolean function. More precisely, let O be a finite set of states, where * ∉ O.
The input consists of the set N = {1, ..., n}, a cost vector (c_1, ..., c_n), where
each c_j ≥ 0, a probability vector p = (p_1, ..., p_n) where each p_j ∈ [0, 1], an
integer Q ≥ 0, and a utility function g : (O ∪ {*})^n → Z_{≥0}. Further, g(x) = 0
if x is the vector that is all *'s, and g(x) = Q if x ∈ O^n. We call Q the goal
utility. We say that b ∈ (O ∪ {*})^n is a cover if g(b) = Q. The cost of cover b is
Σ_{j : b_j ≠ *} c_j.

3 To simplify the exposition, we define the SSSC Problem in terms of integer valued utility
functions only.

Each item j ∈ N has a state x_j ∈ O. We sequentially choose items from
N. Each time we choose an item, we observe its state x_j (we "test" item j).
The states of the items chosen so far are represented by a partial assignment
b ∈ (O ∪ {*})^n. We call b a state vector. When g(b) = Q, we have a cover, and
we can output it. The goal is to determine the order in which to choose the
items, so as to minimize the expected cost incurred in testing. We assume that
an algorithm for this problem will be executed in an on-line setting, and that
it can be adaptive. We will present our algorithms for the binary case, where
O = {0,1}, but will briefly discuss the straightforward extensions to the k-ary
case, where O = {0, 1, ..., k - 1}, k ≥ 2.

Stochastic Submodular Set Cover is a generalization of Submodular Set
Cover, which is in turn a generalization of the standard (weighted) Set Cover
problem, which we call Classical Set Cover. In Classical Set Cover, the input is
a finite set X called the ground set, a set F = {S_1, ..., S_m} where each S_j ⊆ X,
and a cost vector c = (c_1, ..., c_m) where each c_j ≥ 0. The problem is to find a
min-cost "cover", where a cover is a subset F' ⊆ F such that ∪_{S_j ∈ F'} S_j = X,
and the cost of cover F' is Σ_{S_j ∈ F'} c_j.

In Submodular Set Cover, the input is a cost vector c = (c_1, ..., c_n), where
each c_j ≥ 0, and a utility function g : 2^N → Z_{≥0} such that g is monotone and
submodular, g(∅) = 0, and g(N) = Q. The goal is to find a subset S ⊆ N such
that g(S) = Q and Σ_{j ∈ S} c_j is minimized. Submodular Set Cover can be viewed
as a special case of Stochastic Submodular Set Cover in which each p_j is equal
to 1.

4.3 The Adaptive Greedy algorithm for Stochastic Submodular Set Cover



1: b ← (*, *, ..., *)    // b assigns * to all x_i
2: l ← 0, F^0 ← ∅
3: while b is not a solution to SSSC (g(b) < Q) do
4:     l ← l + 1
5:     j_l ← arg min_{j ∉ F^{l-1}} c_j / E[g_b(j)]
6:     k ← the state of j_l    // "test" j_l
7:     F^l ← F^{l-1} ∪ {j_l}    // F^l = dom(b)
8:     b_{j_l} ← k
9: end while
10: return b

Algorithm 1: Adaptive Greedy



The Classical Set Cover problem has a simple greedy approximation algo-
rithm that repeatedly chooses the subset with the "best bang for the buck" for
inclusion in the cover, i.e., the subset covering the largest number of new ele-
ments per unit cost. The generalization of this algorithm to Submodular
Set Cover, and its analysis, is due to Wolsey [40]. It repeatedly chooses the ele-
ment that will add the maximum amount of additional utility per unit cost.
The Adaptive Greedy algorithm of Golovin and Krause, for the SSSC problem,
is a further generalization of this greedy approach. It repeatedly chooses to
test the element with the maximum expected increase in utility, per unit cost.
(Golovin and Krause actually formulated Adaptive Greedy for use in solving a
somewhat more general problem than SSSC, but here we describe it only as it
applies to SSSC.)

We present the pseudocode for Adaptive Greedy in Algorithm 1. Some of
the variables used in the pseudocode are not necessary for the running of the
algorithm, but are useful in its analysis. (In Step 5, assume that if E[g_b(j)] = 0,
the expression evaluates to ∞.)

Golovin and Krause proved that Adaptive Greedy achieves a solution that 
is within a factor of (ln Q + 1) of optimal, where Q is the goal utility. We will
make repeated use of this bound. 
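
The greedy rule described above can be sketched in Python for the binary case. This is a hedged illustration rather than the authors' implementation: the function name `adaptive_greedy`, the tuple representation of state vectors, and the toy utility `or_utility` (a utility for evaluating a three-variable disjunction, with goal value 3) are assumptions made for this example. Items with zero expected gain are given an infinite cost-to-gain ratio so that they are never selected, which is also an assumption of the sketch.

```python
import math

STAR = "*"  # marks an untested item

def adaptive_greedy(g, Q, costs, probs, x):
    """Sketch of Adaptive Greedy for binary SSSC.

    g     : utility function on tuples over {0, 1, STAR}
    Q     : goal utility
    costs : c_j for each item
    probs : p_j = Prob[x_j = 1]
    x     : hidden states, revealed one item at a time
    Returns the final state vector b and the total testing cost.
    """
    n = len(costs)
    b = [STAR] * n
    total_cost = 0.0

    def expected_gain(j):
        if b[j] != STAR:
            return 0.0
        base = g(tuple(b))
        result = 0.0
        for outcome, prob in ((1, probs[j]), (0, 1.0 - probs[j])):
            trial = list(b)
            trial[j] = outcome
            result += prob * (g(tuple(trial)) - base)
        return result

    def ratio(j):
        e = expected_gain(j)
        return costs[j] / e if e > 0 else math.inf

    while g(tuple(b)) < Q:
        j = min(range(n), key=ratio)  # best expected utility gain per unit cost
        if math.isinf(ratio(j)):
            raise ValueError("no remaining test has positive expected gain")
        b[j] = x[j]                   # "test" item j and record its state
        total_cost += costs[j]
    return tuple(b), total_cost

def or_utility(b):
    """Toy utility for x_0 OR x_1 OR x_2: reaches 3 once a 1 is seen or all are 0."""
    zeros = sum(1 for v in b if v == 0)
    has_one = any(v == 1 for v in b)
    return 3 - (3 - zeros) * (0 if has_one else 1)

final_b, cost = adaptive_greedy(
    or_utility, 3, [4.0, 1.0, 3.0], [0.8, 0.1, 0.5], (0, 0, 1)
)
```

On this instance the sketch tests item 1 first, then item 2, and stops as soon as a 1 is observed, leaving item 0 untested.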

5 Function Evaluation and the SSSC Problem 

We design approximation algorithms for solving Stochastic Boolean Function 
Evaluation problems by reducing them to Stochastic Submodular Set Cover 
problems. 

5.1 The Q-value approach

We introduce the following definition: 

Definition: Let f(x_1, ..., x_n) be a Boolean function. Let g : {0,1,*}^n → Z_{≥0}
be a utility function. We say that g is assignment feasible for f, with goal value
Q, if (1) g is monotone and submodular, (2) g(*, *, ..., *) = 0, and (3) for all
b ∈ {0,1,*}^n, g(b) = Q iff b is either a 0-certificate or a 1-certificate of f.

More particularly, we propose the following generic approach to solving
Boolean function evaluation problems, which we call the Q-value approach. To
evaluate Boolean function f, with respect to probability vector p and cost vec-
tor c, we first construct an assignment feasible utility function g for f with goal
value Q. We then apply Adaptive Greedy to solve the SSSC problem on inputs
g, Q, c, and p. Because g(b) = Q iff b is either a 0-certificate or a 1-certificate
of f, the decision tree that is (implicitly) output by Adaptive Greedy is also a
solution to the evaluation problem for f. By the bound of Golovin and Krause
on Adaptive Greedy, this solution is within a factor of (ln Q + 1) of optimal.

The challenge in using the above approach is in constructing g. Not only
must g be assignment feasible for f, but the goal value Q should be subexponen-
tial, in order to obtain a good approximation bound. We will use the following
lemma, due to Guillory and Bilmes, in our construction of g.

Lemma 1. Let g_0 : {0,1,*}^n → Z_{≥0}, g_1 : {0,1,*}^n → Z_{≥0}, and Q_0, Q_1 ∈
Z_{≥0} be such that g_0 and g_1 are monotone, submodular utility functions,
g_0(*, *, ..., *) = g_1(*, *, ..., *) = 0, and g_0(a) ≤ Q_0 and g_1(a) ≤ Q_1 for all
a ∈ {0,1}^n.

Let Q_∨ = Q_0 Q_1 and let g_∨ : {0,1,*}^n → Z_{≥0} be such that
g_∨(b) = Q_∨ - (Q_0 - g_0(b))(Q_1 - g_1(b)).

Let Q_∧ = Q_0 + Q_1 and let g_∧ : {0,1,*}^n → Z_{≥0} be such that
g_∧(b) = g_0(b) + g_1(b).

Then both g_∨ and g_∧ are monotone and submodular, and g_∨(*, ..., *) =
g_∧(*, ..., *) = 0. For all b ∈ {0,1,*}^n, g_∨(b) = Q_∨ iff g_0(b) = Q_0 or g_1(b) = Q_1,
or both. Further, g_∧(b) = Q_∧ iff g_0(b) = Q_0 and g_1(b) = Q_1.
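
The two constructions in the lemma are simple enough to state as code. The sketch below is illustrative only: the names `or_construction` and `and_construction`, and the two toy utilities (each equal to 1 exactly when one particular test has been performed), are assumptions introduced for this example.

```python
STAR = "*"  # marks a test that has not been performed

def or_construction(g0, Q0, g1, Q1):
    """Disjunctive construction: the goal Q0*Q1 is reached iff g0 reaches Q0 or g1 reaches Q1."""
    Q = Q0 * Q1
    def g(b):
        return Q - (Q0 - g0(b)) * (Q1 - g1(b))
    return g, Q

def and_construction(g0, Q0, g1, Q1):
    """Conjunctive construction: the goal Q0+Q1 is reached iff both goals are reached."""
    Q = Q0 + Q1
    def g(b):
        return g0(b) + g1(b)
    return g, Q

# Toy utilities on two variables (hypothetical): each has goal value 1.
def first_tested(b):
    return 0 if b[0] == STAR else 1

def second_tested(b):
    return 0 if b[1] == STAR else 1

g_or, Q_or = or_construction(first_tested, 1, second_tested, 1)
g_and, Q_and = and_construction(first_tested, 1, second_tested, 1)
```

With these toy utilities, performing either test reaches the disjunctive goal, while the conjunctive goal requires both tests.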

5.2 CDNF evaluation via the Q-value approach 

Using the Q-value approach, together with Lemma 1, it is easy to derive an
algorithm for evaluating CDNF formulas. A CDNF formula for a Boolean func-
tion f is a pair (φ_0, φ_1) where φ_0 and φ_1 are CNF and DNF formulas for f,
respectively.

Theorem 1. There is a polynomial-time approximation algorithm solving the
function evaluation problem for CDNF formulas, which achieves a solution that
is within a factor of O(log kd) of optimal, where k is the number of clauses in
the input CNF, and d is the number of terms in the input DNF.

Proof. Let φ_0 be the input CNF and φ_1 be the input DNF, both defined on
{0,1}^n. Let f be the Boolean function defined by these formulas. Let k and
d be, respectively, the number of clauses and terms of φ_0 and φ_1. Let g_0 :
{0,1,*}^n → Z_{≥0} be such that for a ∈ {0,1,*}^n, g_0(a) is the number of terms
of φ_1 set to 0 by a (i.e. terms with a literal x_i such that a_i = 0, or a literal ¬x_i
such that a_i = 1). Similarly, let g_1(a) be the number of clauses of φ_0 set to 1
by a. Clearly, g_0 and g_1 are monotone and submodular. Partial assignment b is
a 0-certificate of f iff g_0(b) = d and a 1-certificate of f iff g_1(b) = k. Applying
the disjunctive construction of Lemma 1 to g_1 and g_0, we get a utility function
g that is assignment feasible for f with goal value Q = kd. Applying Adaptive
Greedy and the (ln Q + 1) bound yields the theorem. □
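
The utility function built in the proof can be sketched in Python as follows. The representation is an assumption made for illustration: a literal is a pair (i, v) meaning "x_i = v", a clause or term is a list of literals, and the function name `cdnf_utility` is hypothetical. The example formula f = (x_0 AND x_1) OR x_2 is likewise chosen only for this sketch.

```python
STAR = "*"  # marks an untested variable

def cdnf_utility(cnf, dnf):
    """Build the utility g and goal value Q = k*d for a CDNF formula.

    cnf : list of clauses, each a list of literals (i, v) meaning x_i = v
    dnf : list of terms, in the same literal format
    """
    k, d = len(cnf), len(dnf)

    def g0(b):
        # Number of DNF terms set to 0: some literal of the term is contradicted.
        return sum(any(b[i] != STAR and b[i] != v for i, v in term) for term in dnf)

    def g1(b):
        # Number of CNF clauses set to 1: some literal of the clause is satisfied.
        return sum(any(b[i] == v for i, v in clause) for clause in cnf)

    def g(b):
        # Disjunctive construction applied to g1 (goal k) and g0 (goal d).
        return k * d - (d - g0(b)) * (k - g1(b))

    return g, k * d

# f = (x_0 AND x_1) OR x_2, written as a DNF and as an equivalent CNF.
dnf = [[(0, 1), (1, 1)], [(2, 1)]]
cnf = [[(0, 1), (2, 1)], [(1, 1), (2, 1)]]
g, Q = cdnf_utility(cnf, dnf)
```

In this example g reaches Q = 4 on the 1-certificate (*, *, 1) and on the 0-certificate (0, *, 0), and stays below Q on (0, *, *), which is not a certificate.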

Given a decision tree computing a Boolean function f, it is easy to construct
CNF and DNF formulas for the function, each of which has size at most equal
to the number of leaves of the tree. Thus the above theorem also gives an
approximation algorithm within a factor O(ln t) of optimal, where t is the
number of leaves in the tree.

5.3 Linear threshold evaluation via the Q-value approach 

A linear threshold formula with integer coefficients has the form Σ_{i=1}^n a_i x_i ≥ θ,
where the a_i and θ are integers. It represents the Boolean function f : {0,1}^n →
{0,1} such that f(x) = 1 if Σ_{i=1}^n a_i x_i ≥ θ, and f(x) = 0 otherwise.

We now show how to use the Q-value approach to obtain an algorithm
solving the function evaluation problem for linear threshold formulas with in-
teger coefficients. The algorithm achieves a solution that is within a factor of
O(log D) of optimal, where Σ_{i=1}^n a_i x_i ≥ θ is the input linear threshold formula
and D = Σ_{i=1}^n |a_i|. This algorithm, like the CDNF algorithm, works by reduc-
ing the evaluation problem to a stochastic version of Submodular Set Cover.
One difference between them is that the CDNF algorithm actually reduces the
evaluation problem to a stochastic version of the Classical Set Cover problem (each
x_i covers a subset of the (term, clause) pairs), whereas here there is no associated
Classical Set Cover problem.

Let h(x) = (Σ_{i=1}^n a_i x_i) - θ. For b ∈ {0,1,*}^n, let min(b) = min{h(b') :
b' ∈ {0,1}^n and b' ~ b} and let max(b) = max{h(b') : b' ∈ {0,1}^n and b' ~ b}.
Thus min(b) = (Σ_{j : b_j ≠ *} a_j b_j) + (Σ_{i : a_i < 0, b_i = *} a_i) - θ,
max(b) = (Σ_{j : b_j ≠ *} a_j b_j) + (Σ_{i : a_i > 0, b_i = *} a_i) - θ,
and each can be calculated in linear time. Let R_min = min(*, ..., *) and
R_max = max(*, ..., *). If R_min ≥ 0 or R_max < 0, f is
constant and no testing is needed. Suppose this is not the case.

Let Q_1 = -R_min and let submodular utility function g_1 be such that g_1(b) =
min{-R_min, min(b) - R_min}. Intuitively, Q_1 - g_1(b) is the number of different
values of h that can be induced by extensions b' of b such that f(b') = 0.
Similarly, define g_0(b) = min{R_max + 1, R_max - max(b)} and Q_0 = R_max + 1.
Thus b is a 1-certificate of f iff g_1(b) = Q_1, and a 0-certificate iff g_0(b) = Q_0.

We apply the disjunctive construction of Lemma 1 to construct $g(b) = Q -
(Q_1 - g_1(b))(Q_0 - g_0(b))$, which is an assignment feasible utility function for
$f$ with goal value $Q = Q_1 Q_0$. Since $Q_1 + Q_0 = R_{max} - R_{min} + 1 = D + 1$,
we have $\ln Q = O(\log D)$. Finally, we obtain an $O(\log D)$ approximation
bound by simply applying the $(\ln Q + 1)$ bound on Adaptive Greedy.
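
The construction above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names (`h_min`, `h_max`, `threshold_utility`) and the use of `None` for an untested bit are assumptions of the example, not notation from the paper.

```python
STAR = None  # stands for an untested bit (the "*" of a partial assignment)

def h_min(a, theta, b):
    """min(b): smallest value of h over all full extensions of b."""
    return sum(ai * bi if bi is not STAR else min(ai, 0)
               for ai, bi in zip(a, b)) - theta

def h_max(a, theta, b):
    """max(b): largest value of h over all full extensions of b."""
    return sum(ai * bi if bi is not STAR else max(ai, 0)
               for ai, bi in zip(a, b)) - theta

def threshold_utility(a, theta, b):
    """Return (g(b), Q) for the formula sum(a_i * x_i) >= theta."""
    blank = [STAR] * len(a)
    r_min, r_max = h_min(a, theta, blank), h_max(a, theta, blank)
    # The construction is only used when f is not constant.
    assert r_min < 0 <= r_max, "f is constant; no testing is needed"
    q1, q0 = -r_min, r_max + 1
    g1 = min(q1, h_min(a, theta, b) - r_min)
    g0 = min(q0, r_max - h_max(a, theta, b))
    q = q1 * q0
    return q - (q1 - g1) * (q0 - g0), q
```

For the formula $3x_1 - 2x_2 + x_3 \ge 2$, the partial assignment $(1, 0, *)$ is a 1-certificate and $(0, *, *)$ is a 0-certificate, so both reach the goal value $Q = 12$, while $(1, *, *)$ reaches only utility 9.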

The quantity $D$ depends on the magnitudes of the coefficients and can be
exponential in $n$, the number of variables. One might hope to obtain a better
approximation factor, still using the Q-value approach, by designing a more clever
assignment-feasible utility function with a much lower goal value $Q$. However,
in the next section we explore the limitations of the Q-value approach, and show
that this is not possible. Achieving a 3-approximation for this problem, which
we do in Section 7, does indeed require a different approach.

5.4 Limitations of the Q-value approach

The Q-value approach depends on finding an assignment feasible utility function
$g$ for $f$. We first demonstrate that a generic such utility function $g$ exists for
all Boolean functions $f$. Let $Q_0 = |\{a \in \{0,1\}^n : f(a) = 0\}|$ and $Q_1 = |\{a \in
\{0,1\}^n : f(a) = 1\}|$. For partial assignment $b$, let $g_0(b) = Q_0 - |\{a \in \{0,1\}^n : a \sim
b, f(a) = 0\}|$ with goal value $Q_0$, and let $g_1(b) = Q_1 - |\{a \in \{0,1\}^n : a \sim b, f(a) =
1\}|$ with goal value $Q_1$. Then $g_0$, $Q_0$, $g_1$ and $Q_1$ obey the properties of Lemma 1.
Apply the disjunctive construction in that lemma, and let $g$ be the resulting
utility function. Then $g$ is assignment feasible for $f$ with goal value $Q = Q_1 Q_0$.
In fact, this $g$ is precisely the utility function that would be constructed by the
approximation algorithm of Golovin et al. for computing a consistent decision
tree of min-expected cost with respect to a sample, if we take the sample to be
the set of all $2^n$ entries $(x, f(x))$ in the truth table of $f$ [19].

The goal value $Q$ of this $g$ is $2^{\Theta(n)}$, so in this case the bound for Adaptive
Greedy, $(\ln Q + 1)$, is linear in $n$.

Since we want an approximation factor that is sublinear in n, we would 
instead like to construct an assignment-feasible utility function for / whose goal 
value Q is sub-exponential in n. However, we will show that this is not possible 
even for some simple Boolean functions /. We begin by introducing the following 
combinatorial measure of a Boolean function, which we call its Q-value. 

Definition: The Q-value of a Boolean function $f : \{0,1\}^n \to \{0,1\}$ is the
minimum integer $Q$ such that there exists an assignment feasible utility function
$g$ for $f$ with goal value $Q$.

The generic utility function $g$ given above shows that the Q-value of every
$n$-variable Boolean function is upper bounded by $2^{O(n)}$. To lower bound the
Q-values of some particular functions, we will use the following technical lemma.

Lemma 2. Let $f(x_1, \dots, x_n)$ be a Boolean function, where $n$ is even. Further,
let $f$ be such that for all $n' \le n/2$, and for all $b \in \{0,1,*\}^n$, if $b_i = b_{n/2+i} =
*$ for all $i \in \{n'+1, \dots, n/2\}$, the following properties hold: (1) if for all
$i \in \{1, \dots, n'\}$, exactly one of $b_i$ and $b_{n/2+i}$ is equal to $*$ and the other is
equal to 1, then $b$ is not a 0-certificate or a 1-certificate of $f$, and (2) if for all
$i \in \{1, \dots, n'-1\}$, exactly one of $b_i$ and $b_{n/2+i}$ is equal to $*$ and the other is
equal to 1, and $b_{n'} = b_{n/2+n'} = 1$, then $b$ contains a 1-certificate of $f$. Then the
Q-value of $f$ is at least $2^{n/2}$.

Proof. Let $f$ have the properties specified in the lemma. For strings $r, s \in
\{0,1,*\}^l$, where $0 \le l \le n/2$, let $d_{r,s} \in \{0,1,*\}^n$ be such that $d_i = r_i$ and
$d_{n/2+i} = s_i$ for $i \in \{1, \dots, l\}$, and $d_i = *$ for all other $i$.

Suppose $g$ is an assignment feasible utility function for $f$ with goal value $Q$.
We prove the following claim. Let $0 \le l \le n/2$. Then there exist $r, s \in \{0,1,*\}^l$
such that $0 < Q - g(d_{r,s}) \le Q/2^l$, and for all $i \in \{1, \dots, l\}$, either $r_i = 1$ and
$s_i = *$, or $r_i = *$ and $s_i = 1$.

We prove the claim by induction on $l$. It clearly holds for $l = 0$. For
the inductive step, assume it holds for $l$. We show it holds for $l + 1$. Let
$r, s \in \{0,1,*\}^l$ be as guaranteed by the assumption, so $Q - g(d_{r,s}) \le Q/2^l$.

For $a \in \{0,1,*\}$, $ra$ denotes the concatenation of string $r$ with $a$, and
similarly for $sa$. By the conditions on $f$ given in the lemma, $d_{r*,s*}$ is not a 0-certificate or
1-certificate of $f$. However, $d_{r1,s1}$ is a 1-certificate of $f$ and so $g(d_{r1,s1}) = Q$.

If $Q - g(d_{r1,s*}) \le Q/2^{l+1}$, then the claim holds for $l + 1$, because $r1, s*$ have
the necessary properties.

Suppose $Q - g(d_{r1,s*}) > Q/2^{l+1}$. Then, because $g(d_{r1,s1}) = Q$, $g(d_{r1,s1}) -
g(d_{r1,s*}) > Q/2^{l+1}$. Note that $d_{r1,s1}$ is the extension of $d_{r1,s*}$ produced by
setting $d_{n/2+l+1}$ to 1. Similarly, $d_{r*,s1}$ is the extension of $d_{r*,s*}$ produced by
setting $d_{n/2+l+1}$ to 1. Therefore, by the submodularity of $g$, $g(d_{r*,s1}) - g(d_{r*,s*}) \ge
g(d_{r1,s1}) - g(d_{r1,s*})$, and thus $g(d_{r*,s1}) - g(d_{r*,s*}) > Q/2^{l+1}$.
Let $A = g(d_{r*,s1}) - g(d_{r*,s*})$ and $B = Q - g(d_{r*,s1})$. Thus $A > Q/2^{l+1}$, and
$A + B = Q - g(d_{r*,s*}) = Q - g(d_{r,s}) \le Q/2^l$, where the last inequality is from the
original assumptions on $r$ and $s$. It follows that $B = Q - g(d_{r*,s1}) < Q/2^{l+1}$,
and the claim holds for $l + 1$, because $r*, s1$ have the necessary properties.

Taking $l = n/2$, the claim says there exists $d_{r,s}$ such that $0 < Q - g(d_{r,s}) \le
Q/2^{n/2}$. Since $g$ is integer-valued, $Q - g(d_{r,s}) \ge 1$, and so $Q \ge 2^{n/2}$. □

The above lemma immediately implies the following theorem. 

Theorem 2. Let $n$ be even, and let $f : \{0,1\}^n \to \{0,1\}$ be the Boolean function
represented by the read-once DNF formula $\phi = t_1 \vee t_2 \vee \dots \vee t_{n/2}$ where each
$t_i = x_i x_{n/2+i}$. The Q-value of $f$ is at least $2^{n/2}$.

The above theorem shows that our Q-value approach to approximating function
evaluation problems will not yield a good approximation bound for either
read-once DNF formulas or for DNF formulas with terms of length 2.

In the next theorem, we show that there is a particular linear threshold
function whose Q-value is at least $2^{n/2}$. It follows that the Q-value approach
will not yield a good approximation bound for linear threshold formulas either.

We note that the function described in the next theorem has been studied
before. As mentioned in [25], there is a lower bound of essentially $2^{n/2}$ on the
size of the largest integer coefficients in any representation of the function as a
linear threshold formula with integer coefficients.

Theorem 3. Let $f(x_1, \dots, x_n)$ be the function defined for even $n$, whose value
is 1 iff the number represented in binary by bits $x_1 \dots x_{n/2}$ is strictly less than
the number represented in binary by bits $x_{n/2+1}, \dots, x_n$, and 0 otherwise. The
Q-value of $f$ is at least $2^{n/2}$.

Proof. We define a new function:

$f'(x_1, \dots, x_n) = f(\neg x_1, \dots, \neg x_{n/2}, x_{n/2+1}, \dots, x_n)$

That is, $f'(x_1, \dots, x_n)$ is computed by negating the assignments to the first
$n/2$ variables, and then computing the value of $f$ on the resulting assignment.
Function $f'$ obeys the conditions of Lemma 2 and so has Q-value at least $2^{n/2}$.
Then $f$ also has Q-value at least $2^{n/2}$, because the Q-value is not changed by
the negation of input variables. □
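
For small $n$, the two conditions of Lemma 2 can be checked by brute force. The sketch below does this for the read-once DNF of Theorem 2 and for the function $f'$ from the proof of Theorem 3. All names are hypothetical, `None` stands for $*$, and the first bit of each half is assumed to be the most significant one.

```python
from itertools import product

def dnf_pairs(x):
    """Theorem 2: OR over i of (x_i AND x_{n/2+i})."""
    half = len(x) // 2
    return int(any(x[i] and x[half + i] for i in range(half)))

def f_prime(x):
    """Theorem 3's f after negating the first n/2 inputs."""
    half = len(x) // 2
    first = tuple(1 - v for v in x[:half])
    return int(first < tuple(x[half:]))  # lexicographic = binary comparison

def reachable_values(f, b):
    """Set of values f takes on the full extensions of partial assignment b."""
    free = [i for i, v in enumerate(b) if v is None]
    values = set()
    for bits in product((0, 1), repeat=len(free)):
        x = list(b)
        for i, v in zip(free, bits):
            x[i] = v
        values.add(f(x))
    return values

def obeys_lemma2(f, n):
    half = n // 2
    for n_prime in range(half + 1):
        # sides[i] chooses which element of pair i is 1 (the other stays *)
        for sides in product((0, 1), repeat=n_prime):
            b = [None] * n
            for i, side in enumerate(sides):
                b[i + side * half] = 1
            if reachable_values(f, b) != {0, 1}:        # property (1)
                return False
            if n_prime >= 1:                            # property (2)
                forced = list(b)
                forced[n_prime - 1] = forced[half + n_prime - 1] = 1
                if reachable_values(f, forced) != {1}:
                    return False
    return True
```

Both functions pass the check for $n = 6$, whereas a function such as the OR of all variables fails property (1).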

Since the Q-value approach is not always an effective way of achieving good
approximation bounds, we can ask whether there are good alternatives. We
prove a new bound on Adaptive Greedy in Section 8, which is $O(\log P)$, where
$P \le Q$ is the maximum amount of utility gained by testing a single variable $x_i$.
This suggests we may want to look at the P-value, rather than the Q-value. However,
since the results of the tests of all $n$ variables yield a utility value of $Q$, there must
be at least one test that alone would yield a utility gain of at least $Q/n$. Thus
$P \ge Q/n$, so whenever $Q$ is exponential in $n$, so is $P$. Another possibility might
be to exploit the fact that Golovin and Krause's bounds on Adaptive Greedy
apply to a more general class of utility functions than the assignment feasible
utility functions we consider here, but we do not pursue that possibility.

Instead, we give a new algorithm for the SSSC problem.

6 Adaptive Dual Greedy



We now present Adaptive Dual Greedy, our new algorithm for the binary version
of the SSSC problem. We note that it easily extends to the $k$-ary version, where
$k > 2$, with no change in the approximation bound.

Like Fujito's Dual Greedy algorithm for (non-adaptive) Stochastic Submodular
Set Cover, our algorithm is based on the dual of the LP relaxation of
Wolsey's IP for the (deterministic) Submodular Set Cover Problem. We present
this dual LP in Figure 1.



Max $\sum_{S \subseteq N} (g(N) - g(S))\, y_S$
s.t.

$\sum_{S \subseteq N} g_S(j)\, y_S \le c_j \quad \forall j \in N$
$y_S \ge 0 \quad \forall S \subseteq N$

Figure 1: Dual LP for the (Deterministic) Submodular Set Cover Problem 

Here, $g : 2^N \to \mathbb{Z}_{\ge 0}$ is the input monotone submodular utility function. The
variables of the dual LP are $y_S$; one variable for each subset $S \subseteq N$.

Now consider the binary Stochastic Submodular Set Cover (SSSC) problem
on an input utility function $g : \{0,1,*\}^n \to \mathbb{Z}_{\ge 0}$ with goal utility $Q$, probability
vector $p$ and cost vector $c$. For $x \in \{0,1\}^n$, define $g_x : 2^N \to \mathbb{Z}_{\ge 0}$ such that for
$S \subseteq N$, $g_x(S) = g(S, x)$. Thus $g_x(N) = Q$.

A decision tree solving the binary SSSC problem on $g$ defines, for each
$x \in \{0,1\}^n$, a set $S$ of variables that will be tested (corresponding to the
root-leaf path on $x$), such that the result of those tests achieves the goal utility $Q$.
Thus $S$ corresponds to a feasible solution to the (deterministic) Submodular
Set Cover problem with input utility function $g_x$. In effect, the decision tree
is computing feasible solutions for $2^n$ Submodular Set Cover instances, one for
each $x \in \{0,1\}^n$. Each of these instances has a corresponding dual LP $L_x$ of
the form given in Figure 1 with $g = g_x$. Note that for $S \subseteq N$, $j \in N - S$,
$g_S(j) = g_{S,x}(j)$.

We denote by $q_x$ the objective function of $L_x$, so $q_x(y)$ is $\sum_{S \subseteq N} (Q - g(S, x))\, y_S$.
We denote by $h_{j,x}(y)$ the left-hand side of the dual constraint for $j$ in $L_x$, so
$h_{j,x}(y)$ is $\sum_{S \subseteq N} g_{S,x}(j)\, y_S$.

The main idea behind Fujito's Dual Greedy algorithm is to iteratively assign
values to a subset of the $y_S$ variables, which are all set initially to 0. The next
variable to be assigned a value is chosen in a greedy way, and the assignment is
done to maintain the invariant that the current assignment to the $y_S$ variables is
a feasible solution to the dual LP. Each assignment causes another constraint of
the dual to be tight. The item $j$ corresponding to that constraint is then added
to an initially empty set $J$. When $J$ is a cover (i.e., $g(J) = Q$), the algorithm
halts and outputs $J$.
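
The loop just described can be sketched as follows for the deterministic problem. This is a minimal illustration written from the description above, not a transcription of Fujito's algorithm; the names `dual_greedy`, `SETS` and `coverage` are hypothetical, and the example utility is an ordinary set-coverage function.

```python
from fractions import Fraction

def dual_greedy(g, costs, goal):
    """g maps a frozenset of items to an integer utility (monotone, submodular)."""
    # slack[j] = c_j - sum_S g_S(j) * y_S, the room left in j's dual constraint
    slack = {j: Fraction(c) for j, c in costs.items()}
    y = {}
    cover = []
    while g(frozenset(cover)) < goal:
        S = frozenset(cover)
        gains = {j: g(S | {j}) - g(S) for j in costs if j not in S}
        # Raising y_S makes the constraint with the smallest slack/gain tight first.
        j_next = min((j for j in gains if gains[j] > 0),
                     key=lambda j: slack[j] / gains[j])
        y[S] = slack[j_next] / gains[j_next]
        for j, gain in gains.items():
            slack[j] -= gain * y[S]
        cover.append(j_next)
    return cover, y

# Example instance (an assumption of this sketch): cover the ground set {1, 2, 3}.
SETS = {"a": {1, 2}, "b": {2, 3}, "c": {3}}

def coverage(S):
    return len(set().union(*(SETS[j] for j in S))) if S else 0
```

With costs `{"a": 1, "b": 3, "c": 1}` and goal 3, the sketch returns the cover `["a", "c"]`, and the dual objective $\sum_S (Q - g(S))\, y_S$ evaluates to 2, which here equals the cost of the cover.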

We would like to use a similar approach to solve Stochastic Submodular
Set Cover. More particularly, we would like to iteratively assign values to the
variables $y_S$ in the dual LP $L_x$ for $x$, each time making a constraint for some
$j$ tight, and testing the associated variable $x_j$, until we have a cover (i.e., until
$g(b) = Q$, where $b$ is the state vector).

The major difficulty with this approach is that initially we don't know $x$,
so, in general, we don't know the values of the coefficients in $L_x$, either in the
constraints or in the objective function. Even if we have tested all variables in
some $S$, we won't be able to compute the value of $g_{S,x}(j)$, the coefficient of $y_S$
in the constraint for $j$ in $L_x$, until we have tested $x_j$ as well, and learned its
state.

To deal with this uncertainty, we modify our goal of constructing a feasi- 
ble solution to L x . Instead, we will construct a solution satisfying some new 
constraints. 

Given $x \in \{0,1\}^n$ and $j \in N$, let $x'$ be the assignment produced from
$x$ by complementing $x_j$. We define a new constraint, $h'_{j,x}(y) \le c_j$, where
$h'_{j,x}(y) = \sum_{S \subseteq N} (E[g_{S,x}(j)])\, y_S$. We call this a combined constraint, because it is
a combination of $h_{j,x}$ and $h_{j,x'}$. If $x_j = 1$, $h'_{j,x}(y) = p_j h_{j,x}(y) + (1 - p_j) h_{j,x'}(y)$,
and a symmetric equality holds if $x_j = 0$. Thus each $x$ has $n$ associated
combined constraints, one for each $j$.

Note that $h'_{j,x}(y)$ and $h'_{j,x'}(y)$ are identical, so $x$ and $x'$ (differing only in bit
$j$) have a common combined constraint. Let $x^j$ denote the partial assignment
identical to $x$ except that $x_j = *$. We define $h'_{j,x^j} = h'_{j,x}$, so there is also a
combined constraint associated with partial assignment $x^j$.

In our algorithm, we construct a solution that satisfies the $n$ combined
constraints $h'_{j,x}$ associated with $x$, one for each $j \in N$. We show these constraints
in Figure 2 and present an example of a combined constraint in Figure 3. Once
we have tested all variables in some $S$, we will be able to compute the value of
$E[g_{S,x}(j)]$, the coefficient of $y_S$ in the combined constraint for $j$, even before we
test $j$. We will take advantage of this fact in our algorithm. If, at any point, we
do not have enough information to calculate the coefficient of some variable $y_S$
in a combined constraint, we make sure that $y_S$ is equal to 0 at that time.
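
The coefficients in the example of Figure 3 can be recomputed mechanically. The sketch below encodes that example's utility function as a dictionary keyed by two-character strings (a representation chosen here for illustration) and takes $p_j$ to be the probability that $x_j = 1$; the helper names are hypothetical.

```python
from fractions import Fraction

# Utility function of the n = 2 example in Figure 3.
G = {"**": 0, "1*": 3, "*1": 3, "0*": 2, "*0": 2,
     "00": 4, "01": 4, "10": 4, "11": 4}
SUBSETS = [(), (1,), (2,), (1, 2)]  # order of the variables y_S

def restrict(x, S):
    """Partial assignment that agrees with x on S and is * elsewhere."""
    return "".join(x[i - 1] if i in S else "*" for i in (1, 2))

def marginal(S, x, j):
    """g_{S,x}(j) = g(S + {j}, x) - g(S, x)."""
    return G[restrict(x, set(S) | {j})] - G[restrict(x, set(S))]

def lhs_coefficients(x, j):
    """Coefficients of y_S in h_{j,x}, listed in the order of SUBSETS."""
    return [marginal(S, x, j) for S in SUBSETS]

def combined_coefficients(x, j, p):
    """Coefficients of y_S in the combined constraint h'_{j,x}."""
    x1 = x[:j - 1] + "1" + x[j:]
    x0 = x[:j - 1] + "0" + x[j:]
    return [p[j] * a + (1 - p[j]) * b
            for a, b in zip(lhs_coefficients(x1, j), lhs_coefficients(x0, j))]
```

With $p_1 = p_2 = 1/2$, `combined_coefficients("11", 1, p)` gives the coefficients $5/2, 0, 1, 0$, matching the combined constraint stated in the figure.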



Figure 2: Combined Constraints for x in Adaptive Dual Greedy 

Although we do introduce new constraints, we do not introduce a new ob- 
jective function. 

We present the pseudocode for Adaptive Dual Greedy in Algorithm 2. (In
Step 5, assume that if $E[g_b(j)] = 0$, the expression evaluates to $\infty$.)

In Step 5, the summation is only over $S$ such that $y_S \ne 0$, and we only assign
a non-zero value to $y_S$ after we have tested all elements in $S$. Thus in that step,
for each non-zero $y_S$, we have that $S \subseteq dom(b)$, so $g_{S,b}(j) = g_{S,x}(j)$.

Consider running Adaptive Dual Greedy on a fixed $x \in \{0,1\}^n$. Let $y^x$
denote the resulting assignment to the variables $y_S$. Let $b^x$ be the returned
value of $b$, so $b^x$ is the constructed cover. Let $C(x)$ be the sequence of tested
items, in the order in which they were tested. Let $F(x)$ be the set of subsets $F^l$

Suppose $n = 2$, so $j \in \{1, 2\}$. Let the adaptive submodular utility function $g$
be defined as follows:

g(**) = 0    g(1*) = 3    g(*1) = 3

g(0*) = 2    g(*0) = 2    g(00) = 4

g(01) = 4    g(10) = 4    g(11) = 4

Consider the left hand side of the constraint for $j$:
$g_\emptyset(j)\, y_\emptyset + g_{\{1\}}(j)\, y_{\{1\}} + g_{\{2\}}(j)\, y_{\{2\}} + g_{\{1,2\}}(j)\, y_{\{1,2\}}$.

If $x = 11$, then the left hand side of the constraint for $j = 1$, $h_{1,11}$, is equal to
$3y_\emptyset + 0y_{\{1\}} + 1y_{\{2\}} + 0y_{\{1,2\}}$.

If $x = 01$, then the left hand side of the constraint for $j = 1$, $h_{1,01}$, is
equal to $2y_\emptyset + 0y_{\{1\}} + 1y_{\{2\}} + 0y_{\{1,2\}}$.

If all of the $p_i$ are 1/2 (i.e., $x$ is drawn from the uniform distribution), then the
left hand side of the combined constraint for $j = 1$ is equal to
$2.5y_\emptyset + 0y_{\{1\}} + 1y_{\{2\}} + 0y_{\{1,2\}}$.

Figure 3: Example of a Combined Constraint 



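1: $b \leftarrow (*, *, \dots, *)$, $y_S \leftarrow 0$ for all $S \subseteq N$
2: $F^0 \leftarrow \emptyset$, $l \leftarrow 0$
3: while $b$ is not a solution to SSSC ($g(b) < Q$) do
4:     $l \leftarrow l + 1$
5:     $j_l \leftarrow \arg\min_{j \notin F^{l-1}} \frac{c_j - \sum_{S : y_S \ne 0} (E[g_{S,b}(j)])\, y_S}{E[g_b(j)]}$
6:     $y_{F^{l-1}} \leftarrow \frac{c_{j_l} - \sum_{S : y_S \ne 0} (E[g_{S,b}(j_l)])\, y_S}{E[g_b(j_l)]}$
7:     $k \leftarrow$ the state of $j_l$    // "test" $j_l$
8:     $F^l \leftarrow F^{l-1} \cup \{j_l\}$    // $F^l = dom(b)$
9:     $b_{j_l} \leftarrow k$
10: end while
11: return $b$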

Algorithm 2: Adaptive Dual Greedy 



produced during the run. Note that if $T$ is the number of while loop iterations,
$F^T$ is the set of items in sequence $C(x)$. In what follows, all expectations are
with respect to the product distribution $D_p$.
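
For concreteness, here is a hedged Python sketch of a single run of Adaptive Dual Greedy on a known input $x$, following the steps of Algorithm 2. The representation is an assumption of the sketch: partial assignments are lists with `None` for $*$, `probs[j]` is the probability that $x_j = 1$, and each dual variable $y_{F^{l-1}}$ is stored together with the state vector at the time it was set. The example utility `or_utility` (for evaluating the OR of the bits) is likewise illustrative.

```python
from fractions import Fraction

def adaptive_dual_greedy(g, goal, costs, probs, x):
    n = len(costs)
    b = [None] * n   # state vector
    duals = []       # pairs (state restricted to F^{l-1}, value of y_{F^{l-1}})
    tested = []      # the sequence C(x)

    def expected_gain(state, j):
        """E[g_{S,b}(j)], where `state` is b restricted to S."""
        total = Fraction(0)
        for bit, pr in ((1, probs[j]), (0, 1 - probs[j])):
            ext = list(state)
            ext[j] = bit
            total += pr * (g(ext) - g(state))
        return total

    while g(b) < goal:
        ratio = {}
        for j in range(n):
            # Items with zero expected gain have ratio "infinity" and are skipped.
            if b[j] is None and expected_gain(b, j) > 0:
                slack = costs[j] - sum(expected_gain(s, j) * v for s, v in duals)
                ratio[j] = slack / expected_gain(b, j)
        j_l = min(ratio, key=ratio.get)       # Step 5
        duals.append((list(b), ratio[j_l]))   # Step 6
        b[j_l] = x[j_l]                       # Steps 7-9: test j_l, record its state
        tested.append(j_l)
    return b, tested, duals

def or_utility(b):
    """Goal value n: reached by any bit equal to 1, or by n bits equal to 0."""
    if any(v == 1 for v in b):
        return len(b)
    return sum(1 for v in b if v == 0)
```

On two variables with costs 1 and 2 and uniform probabilities, the run on $x = (1, 0)$ stops after testing the first variable, while the run on $x = (0, 1)$ tests both and sets the dual values $2/3$ and $1$.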

Lemma 3. For every $x \in \{0,1\}^n$ and $j \in \{1, \dots, n\}$,

$h'_{j,x^j}(y^x) = c_j$ for all $j \in C(x)$

$h'_{j,x^j}(y^x) \le c_j$ for all other $j$

Proof. As in Fujito's non-adaptive algorithm, in Algorithm 2 we begin with a
feasible solution to the dual. At each iteration of the algorithm, we update the
solution so that it remains feasible, and one constraint is made tight. Those
constraints remain tight until the algorithm terminates.

Formally, let $T$ be the number of loop iterations of Adaptive Dual Greedy
on $x$. Let $y^t$ denote the value of assignment $y$ at the end of the $t$th iteration.
Thus $y^0$ is the all 0's assignment.

We show by induction that for $t \in \{0, \dots, T\}$, the following holds: (1)
$h'_{j,x^j}(y^t) \le c_j$ for all $j$, (2) $h'_{j,x^j}(y^t) = c_j$ for all $j \in F^t$, and (3) if $S \subseteq N$ where
$S \notin \{F^0, \dots, F^{t-1}\}$, then $y^t_S = 0$.

The above clearly holds for $t = 0$, since $y^0$ is the all 0's assignment. Suppose
the three properties hold for $t$, where $0 \le t < T$. We show they hold for $t + 1$.

Consider the $(t+1)$st iteration. In this iteration, an item $j_{t+1}$ is added to $F^t$,
yielding $F^{t+1}$, and $y_{F^t}$ is assigned a non-zero value.

Since (3) holds for $t$, and only $y_{F^t}$ is assigned a value during the $(t+1)$st
iteration, (3) also holds for $t + 1$.

For all $j \in F^t$, $g_{F^t,x}(j) = 0$, and since $h'_{j,x^j}(y^t) = c_j$ by (2), $h'_{j,x^j}(y^{t+1}) = c_j$.

Consider Step 5 of the algorithm. Note that $h'_{j,x^j}(y) = \sum_{S : y_S \ne 0} (E[g_{S,b}(j)])\, y_S$.
Also, since $F^t = dom(b)$ at this step, $E[g_b(j)] = E[g_{F^t,b}(j)]$. Because (1) and
(2) hold at the start of the $(t+1)$st iteration, and $g_{S,b}(j) = g_{S,x}(j)$ for each $S$ where
$y_S > 0$, the criterion used to select $j_{t+1}$ ensures that the assignment to $y_{F^t}$
causes $h'_{j,x^j}(y^{t+1}) = c_j$ for $j = j_{t+1}$ and $h'_{j,x^j}(y^{t+1}) \le c_j$ for $j \notin F^t \cup \{j_{t+1}\}$.
Thus (1), (2), (3) hold for $t + 1$. □

The following lemma relates the cost of the cover constructed by Adaptive 
Dual Greedy to the left hand sides of the constraints of the L x . 

Lemma 4. $E[\sum_{j \in C(x)} h_{j,x}(y^x)] = E[\sum_{j \in C(x)} c_j]$

Proof. We will use the fact that Adaptive Dual Greedy corresponds to a decision 
tree, which we will call the dual-greedy tree. 

Let $X_j = \{d \in \{0,1,*\}^n : d_j = *,\ d_i \in \{0,1\} \text{ for } i \ne j\}$. For $d \in X_j$, let
$d' = d_{x_j \leftarrow 1}$ and $d'' = d_{x_j \leftarrow 0}$. So each $d \in X_j$ gives rise to two full assignments,
$d'$ and $d''$, depending on which value is assigned to the one unassigned variable
of $d$. We first prove the following claim for $j \in N$.

Claim: $p_j h_{j,d'}(y^{d'}) + (1 - p_j) h_{j,d''}(y^{d''}) = h'_{j,d}(y^{d'})$, and $h'_{j,d}(y^{d'}) = h'_{j,d}(y^{d''})$.

By the definition of $h'$, for all $d \in X_j$, $h'_{j,d}(y^x) = p_j h_{j,d'}(y^x) + (1 - p_j) h_{j,d''}(y^x)$.
For $d \in X_j$, let $Prob[d]$ denote the probability, with respect to $x \sim D_p$, that
$x_i = d_i$ for all $i \ne j$.

Let $d \in X_j$. Consider the execution of Adaptive Dual Greedy on $d'$ and $d''$,
and the associated paths in the dual-greedy tree. Since $d'$ and $d''$ differ only
in bit $j$, either $j$ is not tested in either path, or the paths diverge at a test of
$j$. Thus $j \in C(d')$ iff $j \in C(d'')$. Suppose the paths diverge at a test of $j$, and
suppose it is the $t$th test on those paths. Then the corresponding sets $F^1, \dots, F^t$
constructed by the algorithm are the same for $d'$ and $d''$, as are the assignments
to $y_{F^0}, \dots, y_{F^{t-1}}$. For any $t' \ge t$, if there is a set $F^{t'}$ that is constructed for $d'$, then
$g_{F^{t'}}(j) = 0$, since $j \in F^t \subseteq F^{t'}$. The same holds for $d''$. Comparing $h_{j,d'}(y^{d'})$
to $h_{j,d''}(y^{d''})$, it follows that whenever some $y_S$ is assigned a different value by
$y^{d'}$ and $y^{d''}$, the corresponding coefficients $g_{S,d'}(j)$ and $g_{S,d''}(j)$ are both 0. The
claim now follows from the definitions of $h$ and $h'$.

Using the claim, we can now show that

$$
\begin{aligned}
E\Big[\sum_{j \in C(x)} h_{j,x}(y^x)\Big]
&= \sum_x Prob[x] \Big(\sum_{j \in C(x)} h_{j,x}(y^x)\Big) \\
&= \sum_j \Big( \sum_{d \in X_j : j \in C(d')} Prob[d']\, h_{j,d'}(y^{d'}) + \sum_{d \in X_j : j \in C(d'')} Prob[d'']\, h_{j,d''}(y^{d''}) \Big) \\
&\qquad \text{because for } d = x^j,\ x = d'' \text{ if } x_j = 0 \text{ and } x = d' \text{ if } x_j = 1 \\
&= \sum_j \sum_{d \in X_j : j \in C(d')} \Big( Prob[d']\, h_{j,d'}(y^{d'}) + Prob[d'']\, h_{j,d''}(y^{d''}) \Big) \\
&\qquad \text{because } j \in C(d') \text{ iff } j \in C(d'') \\
&= \sum_j \sum_{d \in X_j : j \in C(d')} Prob[d] \Big( p_j\, h_{j,d'}(y^{d'}) + (1 - p_j)\, h_{j,d''}(y^{d''}) \Big) \\
&= \sum_j \sum_{d \in X_j : j \in C(d')} Prob[d]\, (1 - p_j + p_j)\, h'_{j,d}(y^{d'}) \\
&\qquad \text{by the claim, and because } (1 - p_j + p_j) = 1 \\
&= \sum_j \sum_{d \in X_j : j \in C(d')} \Big( Prob[d](1 - p_j)\, h'_{j,d}(y^{d''}) + Prob[d]\, p_j\, h'_{j,d}(y^{d'}) \Big) \\
&\qquad \text{since } h'_{j,d}(y^{d'}) = h'_{j,d}(y^{d''}) \\
&= \sum_j \sum_{d \in X_j : j \in C(d')} \Big( Prob[d'']\, h'_{j,d}(y^{d''}) + Prob[d']\, h'_{j,d}(y^{d'}) \Big) \\
&= \sum_j \sum_{x : j \in C(x)} Prob[x]\, h'_{j,x^j}(y^x) \\
&= \sum_x \sum_{j \in C(x)} Prob[x]\, h'_{j,x^j}(y^x) \\
&= \sum_x \sum_{j \in C(x)} Prob[x]\, c_j \qquad \text{by Lemma 3} \\
&= E\Big[\sum_{j \in C(x)} c_j\Big]
\end{aligned}
$$

□

Let r be an arbitrary decision tree for the SSSC problem on g,c,p. Let 
C*(x) be the cover produced by r on x. 



Lemma 5. $E[\sum_{j \in C^*(x)} h_{j,x}(y^x)] \le E[\sum_{j \in C^*(x)} c_j]$

Proof. The proof is nearly the same as the proof of Lemma 4. The $y^x$ in the
statement of this lemma is the one constructed by Adaptive Dual Greedy; the
only change is that we are summing over $C^*(x)$ rather than $C(x)$. Thus the
claim in the proof of Lemma 4 still holds. In the subsequent series of equations,
we replace $C(x)$ with $C^*(x)$. Because r is a tree, it still holds that $j \in C^*(d')$
iff $j \in C^*(d'')$. The only change we need to make is at the end, where we use
Lemma 3. Since the $j \in C^*(x)$ are not necessarily in $C(x)$, we need to change
the equality to an inequality. □

We now return to our analysis of the cover produced by Adaptive Dual 
Greedy. 

Lemma 6. $E[\sum_{j \in C(x)} c_j] \le \alpha\, E[q_x(y^x)]$ where $\alpha = \max \frac{\sum_{j \in C(x)} g_{S,x}(j)}{Q - g(S,x)}$, with
the max taken over all $x \in \{0,1\}^n$ and $S \in F(x)$ such that the denominator is
non-zero.

Proof. Consider a run of Adaptive Dual Greedy on a fixed $x$. In iteration $t$ of
the while loop, a value is assigned to $y_S$ where $S = F^{t-1}$.

Let $h_x(y) = \sum_{j \in C(x)} h_{j,x}(y)$. Clearly $h_x$ is a linear function of the variables
$y_S$, and for $S \subseteq N$, the coefficient of $y_S$ in $h_x$ is $\sum_{j \in C(x)} g_{S,x}(j)$. If $S =
F^{t-1}$ for some iteration $t$, then $Q - g(F^{t-1}, x) \ne 0$, and by the definition of $\alpha$,
$\sum_{j \in C(x)} g_{S,x}(j) \le \alpha (Q - g(S,x))$. Since $Q - g(S,x)$ is the coefficient of $y_S$ in $q_x(y)$,
and $y^x_S > 0$ implies that $S = F^{t-1}$ for some iteration $t$, substituting $y^x$ for $y$ we
get $\sum_{j \in C(x)} h_{j,x}(y^x) \le \alpha\, q_x(y^x)$. Taking the expectation of both sides, we get

$E[\sum_{j \in C(x)} h_{j,x}(y^x)] \le \alpha\, E[q_x(y^x)]$

By Lemma 4, $E[\sum_{j \in C(x)} h_{j,x}(y^x)] = E[\sum_{j \in C(x)} c_j]$. □

The next lemma again concerns the cost of the cover produced by an arbi- 
trary decision tree solving the SSSC instance. 

Lemma 7. $E[q_x(y^x)] \le E[\sum_{j \in C^*(x)} c_j]$

Proof. Consider a fixed $x$.

Let $S \subseteq N$. The elements in $C^*(x)$ (and their states) form a cover of $x$, so
$g(C^*(x), x) = Q$. By the monotonicity of $g$, it follows that adding the items in
$C^*(x)$ one by one to $S$ will cause the utility of the resulting set (with respect
to $x$) to rise from $g(S, x)$ to $Q$. It follows from the submodularity of $g$ that
$Q - g(S, x) \le \sum_{j \in C^*(x)} g_{S,x}(j)$.

Since $Q - g(S, x)$ is the coefficient of $y_S$ in $q_x$, $q_x(y^x) \le \sum_{S \subseteq N} \sum_{j \in C^*(x)} g_{S,x}(j)\, y^x_S$.
Since $g_{S,x}(j)$ is the coefficient of $y_S$ in $h_{j,x}$, $\sum_{S \subseteq N} (\sum_{j \in C^*(x)} g_{S,x}(j))\, y^x_S = \sum_{j \in C^*(x)} h_{j,x}(y^x)$.
Taking expectations, combined with Lemma 5, yields this lemma. □

We now have our theorem.






Theorem 4. Given an instance of SSSC with utility function $g$ and goal value
$Q$, Adaptive Dual Greedy constructs a decision tree whose expected cost is no
more than a factor of $\alpha$ larger than the expected cost of the cover produced by
the optimal strategy, where $\alpha = \max \frac{\sum_{j \in C(x)} g_{S,x}(j)}{Q - g(S,x)}$, with the max taken over all
$x \in \{0,1\}^n$ and $S \in F(x)$ such that the denominator is non-zero.

Proof. By Lemma 6, $E[\sum_{j \in C(x)} c_j] \le \alpha\, E[q_x(y^x)]$. Since Lemma 7 applies to
any decision tree solving the SSSC instance, it also applies to the decision tree
corresponding to the optimal strategy, so $E[q_x(y^x)] \le E[\sum_{j \in C^*(x)} c_j]$ and the
theorem follows immediately. □



7 A 3-approximation for linear threshold formula evaluation

We use Adaptive Dual Greedy to obtain our 3-approximation algorithm for 
linear threshold formulas. 

Theorem 5. There is a polynomial-time approximation algorithm solving the 
function evaluation problem for linear threshold formulas with integer coeffi- 
cients, which achieves a solution that is within a factor of 3 of optimal. 

Proof. We modify our previous linear threshold evaluation algorithm from
Section 5.3, substituting Adaptive Dual Greedy for Adaptive Greedy. By
Theorem 4, the resulting algorithm is within a factor of $\alpha$ of optimal, where $\alpha$ is the
approximation factor of Adaptive Dual Greedy. We now show that $\alpha \le 3$ in
this case.

Fix $x$ and consider the run of Adaptive Dual Greedy on $x$. Let $T$ be the
number of loop iterations. Then $C(x) = j_1, \dots, j_T$ is the sequence of tested
items, and $F^t = \{j_1, \dots, j_t\}$. Assume first that $f(x) = 1$. Let $F = F^0 = \emptyset$, and
consider the ratio $\frac{\sum_{j \in C(x)} g_{F,x}(j)}{Q - g(F,x)}$.

We use the definitions and utility functions from the algorithm in Section 5.3.
Assume without loss of generality that neither $g_0$ nor $g_1$ is identically 0.

Let $A = -R_{min}$ and let $B = R_{max} + 1$. Thus $Q - g(\emptyset, x) = AB$.

Let $C_1$ be the set of items $j_l$ in $C(x)$ such that either $x_{j_l} = 1$ and $a_{j_l} > 0$ or
$x_{j_l} = 0$ and $a_{j_l} < 0$. Similarly, let $C_0$ be the set of items $j_l$ in $C(x)$ such that
either $x_{j_l} = 0$ and $a_{j_l} > 0$ or $x_{j_l} = 1$ and $a_{j_l} < 0$.

Testing stops as soon as the goal utility is reached. Since we assumed that
$f(x) = 1$, this means testing on $x$ stops when state vector $b$ satisfies $g_1(b) = Q_1$,
or equivalently, $b$ becomes a 1-certificate of $f$. Thus the last tested item, $j_T$, is
in $C_1$.

It follows that the sum of the $a_{j_l} x_{j_l}$ over all $j_l \in C_1$, excluding $j_T$, is less
than $-R_{min}$, while the sum including $j_T$ is greater than or equal to $-R_{min}$.
Thus, by the definition of utility function $g$, $\sum_{j_l \in C_1 - \{j_T\}} g_{\emptyset,x}(j_l) < AB$. The
maximum possible value of $g_{\emptyset,x}(j_T)$ is $AB$. Therefore, $\sum_{j_l \in C_1} g_{\emptyset,x}(j_l) < 2AB$.

Further, since $x$ does not contain both a 0-certificate and a 1-certificate of
$f$, the sum of the $a_{j_l} x_{j_l}$ over all $j_l \in C_0$ is strictly less than $R_{max}$. Thus by
the definition of $g$, $\sum_{j_l \in C_0} g_{\emptyset,x}(j_l) < AB$.

Summing over all $j_l \in C(x)$, we get that $\sum_{j_l \in C(x)} g_{\emptyset,x}(j_l) < 3AB$.
Therefore, $\frac{\sum_{j \in C(x)} g_{F,x}(j)}{Q - g(F,x)} < 3$, because for $F = \emptyset$, $Q = AB$ and $g(\emptyset, x) = 0$. A
symmetric argument holds when $f(x) = 0$.

It remains to show that the same bound holds when $F \ne \emptyset$. We can
reduce this to the case $F = \emptyset$. Once we have tested the variables in $F^t$, we have
an induced linear threshold evaluation problem on the remaining variables
(replacing the tested variables by their values). Let $g'$ and $Q'$ be the utility
function and goal value for the induced problem, as constructed in the algorithm
of Section 5.3. The ratio $\frac{\sum_{j \in C(x)} g_{F,x}(j)}{Q - g(F,x)}$ is equal to
$\frac{\sum_{j \in C(x) - F} g'_{\emptyset,x'}(j)}{Q' - g'(\emptyset,x')}$, where $x'$
is $x$ restricted to the elements not in $F$. By the argument above, this ratio is
bounded by 3. □

We note that there is a connection between the linear threshold evaluation
problem and the Min-Knapsack problem. In Min-Knapsack, you are given a
set of items with values $a_i > 0$ and weights $c_i > 0$, and the goal is to select a
subset of the items to put in the knapsack such that the total value of the items
is at least $\theta$, and the total weight is minimized. We can therefore solve
Min-Knapsack by simulating the above algorithm on the linear threshold formula
$\sum_{i=1}^{n} a_i x_i \ge \theta$, giving the value 1 as the result of each test.

It is easy to modify the above analysis to show that in this case the ratio $\alpha$ is
at most 2, because $C_0$ is empty. We thus have a combinatorial 2-approximation
algorithm for Min-Knapsack, based on Adaptive Dual Greedy. (In fact, the
deterministic Dual Greedy algorithm of Fujito would be sufficient here, since the
outcomes of the tests are predetermined.) There are several previous combinatorial
and non-combinatorial 2-approximation algorithms for Min-Knapsack, and
the problem also has a PTAS ([8], cf. [24]).
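
A minimal sketch of this Min-Knapsack procedure follows. It is an illustration under stated assumptions rather than the paper's implementation: it uses the utility $\min(\theta, \sum_{i \in S} a_i)$, which is proportional to the Section 5.3 utility when every test returns 1, it assumes the instance is feasible, and the function name is hypothetical.

```python
from fractions import Fraction

def min_knapsack_dual_greedy(values, weights, theta):
    """Pick items of total value >= theta using the dual greedy rule."""
    slack = [Fraction(w) for w in weights]  # weight not yet "paid for" by the dual
    remaining = set(range(len(values)))
    chosen, covered = [], 0
    while covered < theta:
        residual = theta - covered
        # Marginal utility of item j is capped by the value still needed.
        gain = {j: min(values[j], residual) for j in remaining if values[j] > 0}
        j_next = min(sorted(gain), key=lambda j: slack[j] / gain[j])
        y = slack[j_next] / gain[j_next]
        for j in gain:
            slack[j] -= gain[j] * y
        chosen.append(j_next)
        remaining.remove(j_next)
        covered += values[j_next]
    return chosen
```

For values `[3, 2, 2]`, weights `[3, 1, 1]` and $\theta = 4$, the sketch selects items 1 and 2 (0-indexed), with total weight 2.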

8 A new bound for Adaptive Greedy 

We give a new analysis of the Adaptive Greedy algorithm of Golovin and Krause,
whose pseudocode we presented in Algorithm 1. We show that the expected cost
of the solution it computes is within a factor of $2(\ln(\max_j g_b(j)) + 1)$ of optimal
in the binary case, where $b = (*, \dots, *)$. In the $k$-ary case, the 2 in the bound
is replaced by $k$. Note that $\max_j g_b(j)$ is clearly upper bounded by $Q$, and
in some instances may be much less than $Q$. However, because of the factor of
$k$ at the front of our bound, we cannot say that it is strictly better than the
$(\ln Q + 1)$ bound of Golovin and Krause. (The bound that is analogous to ours
in the non-adaptive case, proved by Wolsey, does not have a factor of 2.)

Adaptive Greedy is a natural extension of the Greedy algorithm for (deterministic)
submodular set cover of Wolsey. We will extend Wolsey's analysis [40],
as it was presented by Fujito [15]. In our analysis, we will refer to the dual LPs
$L_x$ that we defined in Section 6, along with the associated notation for their
constraints $h_{j,x}(y) \le c_j$ and objective function $q_x(y)$.

Fix $x \in \{0,1\}^n$. Let $T^x$ be the number of iterations of Algorithm 1 on $x$.
Let $b^t_x$ denote the value of $b$ at the end of iteration $t$ of the while loop on input
$x$, and let $F^t_x$ denote the value of $F^t$. (The $x$ in the notation may be dropped
when it is understood implicitly.) Set $\theta^t_x = \min_{j \notin F^{t-1}} \frac{c_j}{E[g_{b^{t-1}}(j)]}$.

Define $x'$ as the assignment produced from $x$ by complementing $x_j$. For
$j \in N$, let $k$ be the value of $t$ that maximizes $\theta^t_x\, g_{b^{t-1}_x}(j, 1)$ and let $l$ be
the value of $t$ that maximizes $\theta^t_{x'}\, g_{b^{t-1}_{x'}}(j, 0)$. Let $H^k = H(g_{b^0}(j, 1))$ and
$H^l = H(g_{b^0}(j, 0))$, where $H(n)$ denotes the $n$th harmonic number, which is
at most $(\ln n + 1)$. Let $g(j) = \max_{l \in \{0,1\}} g_b(j, l)$ where $b = (*, \dots, *)$. Let
$q_j = 1 - p_j$.

We define $Y^x$ to be the assignment to the dual LP variables $y_S$ setting
$y_{F^0} = \theta^1_x$, $y_{F^t} = (\theta^{t+1}_x - \theta^t_x)$ for $t \in \{1, \dots, T-1\}$ and $y_S = 0$ for all other $S$.

Lemma 8. For every $x \in \{0,1\}^n$ and $j \in \{1, \dots, n\}$, the expected cost of the
greedy solution is at most $E[q_x(Y^x)]$, where $q_x(y) = \sum_{S \subseteq N} g_{S,x}(N - S)\, y_S$.

Proof. By the definition of $Y^x$, the proof follows directly from the analysis of
(non-adaptive) Greedy in Theorem 1 of [16], by linearity of expectation. □

We need to bound the value of the left hand side of each combined constraint 
on assignment Y x . We will use the following lemma from Wolsey's analysis. 

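Lemma 9 (from Wolsey's analysis). Given two sequences $(\alpha^{(t)})_{t=1}^{T}$ and $(\beta^{(t)})_{t=0}^{T-1}$, such that both
are nonnegative and the former is monotonically nondecreasing and the latter
monotonically non-increasing, and that $\beta^{(t)}$ is a nonnegative integer for any
value of $t$, then

$\alpha^{(1)} \beta^{(0)} + (\alpha^{(2)} - \alpha^{(1)}) \beta^{(1)} + \dots + (\alpha^{(T)} - \alpha^{(T-1)}) \beta^{(T-1)}
\le \left( \max_{1 \le t \le T} \alpha^{(t)} \beta^{(t-1)} \right) H(\beta^{(0)})$.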
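
As a quick numeric illustration of the inequality in Lemma 9, the following sketch evaluates both sides for given sequences. The function names and the 0-based list layout (`alpha[t-1]` holds $\alpha^{(t)}$, `beta[t]` holds $\beta^{(t)}$) are assumptions of the example.

```python
from fractions import Fraction

def harmonic(n):
    """H(n) = 1 + 1/2 + ... + 1/n, with H(0) = 0."""
    return sum(Fraction(1, k) for k in range(1, n + 1))

def lemma9_sides(alpha, beta):
    """Return (left-hand side, right-hand side) of the inequality."""
    lhs = alpha[0] * beta[0] + sum((alpha[t] - alpha[t - 1]) * beta[t]
                                   for t in range(1, len(alpha)))
    rhs = max(a * b for a, b in zip(alpha, beta)) * harmonic(beta[0])
    return lhs, rhs
```

For $\alpha = (1, 2, 3)$ and $\beta = (3, 2, 1)$, the left-hand side is $3 + 2 + 1 = 6$ and the right-hand side is $4 \cdot H(3) = 22/3$.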

We now bound the left hand side of each combined constraint on Y x . 

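Lemma 10. For every $x \in \{0,1\}^n$ and $j \in \{1, \dots, n\}$, $h'_{j,x^j}(Y^x) \le c_j\, 2H(\max_k g(x_k))$.

Proof. Without loss of generality, assume $x_j = 1$.

The claim in the proof of Lemma 4 holds for $Y^x$ also, by the same argument.
Therefore,

$$
\begin{aligned}
h'_{j,x^j}(Y^x)
&= p_j\, h_{j,x}(Y^x) + (1 - p_j)\, h_{j,x'}(Y^{x'}) \\
&= p_j \Big[ \theta^1_x\, g_{b^0}(j,1) + \sum_{t=2}^{T^x} (\theta^t_x - \theta^{t-1}_x)\, g_{b^{t-1}_x}(j,1) \Big] \\
&\quad + q_j \Big[ \theta^1_{x'}\, g_{b^0}(j,0) + \sum_{t=2}^{T^{x'}} (\theta^t_{x'} - \theta^{t-1}_{x'})\, g_{b^{t-1}_{x'}}(j,0) \Big] \\
&\le p_j \big[ \theta^k_x\, g_{b^{k-1}_x}(j,1)\, H^k \big] + q_j \big[ \theta^l_{x'}\, g_{b^{l-1}_{x'}}(j,0)\, H^l \big] \qquad \text{by Lemma 9} \\
&\quad + p_j \big[ \theta^l_{x'}\, g_{b^{l-1}_{x'}}(j,1)\, H^l \big] + q_j \big[ \theta^k_x\, g_{b^{k-1}_x}(j,0)\, H^k \big] \\
&= \theta^k_x H^k \big[ p_j\, g_{b^{k-1}_x}(j,1) + q_j\, g_{b^{k-1}_x}(j,0) \big] \\
&\quad + \theta^l_{x'} H^l \big[ q_j\, g_{b^{l-1}_{x'}}(j,0) + p_j\, g_{b^{l-1}_{x'}}(j,1) \big] \\
&\le c_j H^k + c_j H^l \qquad \text{(due to the greediness of Algorithm 1)} \\
&\le c_j\, 2H(g(x_j)) \\
&\le c_j\, 2H(\max_k g(x_k))
\end{aligned}
$$

□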

Theorem 6. Given an instance of SSSC with utility function $g$, Adaptive
Greedy constructs a decision tree whose expected cost is no more than a
factor of $2(\max_j (\ln g(j)) + 1)$ larger than the expected cost of the cover produced by
the optimal strategy.

Proof. Although $Y^x$ does not satisfy the combined constraints for $x$, by Lemma
10, $Y^x / (2H(\max_k g(x_k)))$ does. Moreover, it is clear that for any $j \in N$, $j \in
C(x)$ iff $j \in C(x')$. It follows that for all $S \subseteq N$, either $y_S$ is assigned the same
value in $h_{j,x}$ and $h_{j,x'}$ or the corresponding coefficient is 0. We can then utilize
the arguments in Lemmas 4 and 5, substituting Lemma 10 for Lemma 3. Then
the proof of Lemma 7 from the analysis of Adaptive Dual Greedy can be used
to show that $E[q_x(Y^x)] / 2H(\max_k g(x_k)) \le E[\sum_{j \in C^*(x)} c_j]$. By Lemma 8, the
expected cost of Adaptive Greedy is at most $E[q_x(Y^x)]$. The theorem follows
immediately. □

9 Simultaneous Evaluation and Ranking 

Let $f_1, \dots, f_m$ be (representations of) Boolean functions from a class
$C$, such that each $f_i : \{0,1\}^n \to \{0,1\}$. We consider the
generalization of the Boolean function evaluation problem where instead of
determining the value of a single function $f$ on an input $x$, we need to
determine the value of all $m$ functions $f_i$ on the same input $x$. We are
again given cost vector $c$ and probability vector $p$, and our goal is to
minimize the expected cost of evaluation.
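
To make the objective concrete, the following minimal sketch computes, by brute force, the expected cost of one naive strategy: test variables in a fixed order and stop as soon as all $m$ functions are determined. The function names (`all_determined`, `expected_cost_fixed_order`) and the tiny instance are hypothetical and chosen only for illustration; this is not one of the algorithms analyzed in the paper.

```python
from itertools import product

def all_determined(fs, b):
    """True if every function in fs takes a single value over all completions of b.

    b is a list with entries 0, 1, or None (None marks an untested variable).
    Brute force over completions, so only suitable for small n.
    """
    free = [i for i, v in enumerate(b) if v is None]
    first = None
    for bits in product((0, 1), repeat=len(free)):
        x = list(b)
        for i, v in zip(free, bits):
            x[i] = v
        values = tuple(f(x) for f in fs)
        if first is None:
            first = values
        elif values != first:
            return False
    return True

def expected_cost_fixed_order(fs, p, c, order):
    """Expected cost of testing variables in a fixed order until all fs are determined."""
    n = len(p)
    total = 0.0
    for x in product((0, 1), repeat=n):
        # Probability of x under the product distribution defined by p.
        prob = 1.0
        for xi, pi in zip(x, p):
            prob *= pi if xi == 1 else 1.0 - pi
        b, cost = [None] * n, 0.0
        for j in order:
            if all_determined(fs, b):
                break
            b[j] = x[j]      # pay c[j] to observe bit j
            cost += c[j]
        total += prob * cost
    return total

# Two functions on three variables: f1 = x0 OR x1, f2 = x1 AND x2.
fs = [lambda x: x[0] | x[1], lambda x: x[1] & x[2]]
p = [0.5, 0.5, 0.5]
c = [1.0, 1.0, 1.0]
print(expected_cost_fixed_order(fs, p, c, order=[1, 0, 2]))
```

An adaptive strategy may choose the next variable based on the outcomes seen so far, which is what the algorithms discussed below do.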
We exploit an approach used by Golovin and Krause to obtain an approximation
algorithm based on Adaptive Greedy for solving the simultaneous evaluation
problem for OR formulas [18]. They achieved an approximation bound
of $\ln mt + 1$, where $t$ is the average number of variables in the $m$ evaluated OR
formulas. The algorithm is essentially the same as an earlier algorithm of Liu 
et al., who achieved the same bound with a different analysis [5Tj . 

The approach of Golovin and Krause is as follows. First, convert each
individual formula evaluation problem into a separate SSSC problem instance.
Then combine those instances into a single SSSC problem instance. We
illustrate the approach with an algorithm for simultaneous evaluation of linear
threshold formulas. In what follows, we assume that the $k$th threshold formula
is $\sum_{i=1}^{n} a_{ki} x_i \ge \theta_k$.

Theorem 7. There is a polynomial-time algorithm for solving the simultaneous
evaluation of linear threshold formulas problem which produces a solution that
is within a factor of $O(\log(m D_{avg}))$ of optimal, where $D_{avg}$ is the
average, over $k \in \{1, \dots, m\}$, of $\sum_{i=1}^{n} |a_{ki}|$.

In the special case of OR formulas, where each variable appears in at most
$r$ of them, the algorithm achieves an approximation factor of
$2(\ln(\beta_{max} r) + 1)$, where $\beta_{max}$ is the maximum number of
variables in any of the OR formulas.

Proof. Let $g^{(1)}, \dots, g^{(m)}$ be the $m$ utility functions that would be
constructed if we ran the algorithm from Section 5.3 separately on each of the
$m$ threshold formulas that need to be evaluated. Let $Q^{(1)}, \dots, Q^{(m)}$
be the associated goal values.

Using the conjunctive construction from Lemma 1, we construct utility function
$g$ such that $g(b) = \sum_{k=1}^{m} g^{(k)}(b)$, and
$Q = \sum_{k=1}^{m} Q^{(k)}$. We can evaluate all the threshold formulas by
running Adaptive Greedy with $g$, goal value $Q$, and the given $p$ and $c$,
until it outputs a cover $b$. Given cover $b$, it is easy to determine for each
$f_k$ whether $f_k(x) = 1$ or $f_k(x) = 0$.

In the algorithm of Section 5.3, for each $f_k$, the associated
$Q_k = O(D_k)$, where $D_k$ is the sum of the absolute values of the
coefficients in $f_k$. Since $Q = \sum_k Q_k$, the $O(\log(m D_{avg}))$ bound
follows from the $(\ln Q + 1)$ bound for Adaptive Greedy.

Suppose each threshold formula is an OR formula. For $b \in \{0,1,*\}^n$,
$\max_{l \in \{0,1\}} g^{(k)}_b(j, l) = 0$ if $x_j$ does not appear in the
$k$th OR formula; otherwise it is equal to the number of variables in that
formula. The $2(\ln(\beta_{max} r) + 1)$ approximation factor then follows by
our bound on Adaptive Greedy in Theorem 6. □
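
The procedure in the proof above can be sketched in code as follows. This is a minimal illustration under stated assumptions: the per-formula utility in `threshold_utility` is our reading of the kind of construction referred to as the algorithm of Section 5.3 (progress toward a 1-certificate and a 0-certificate, combined multiplicatively), the greedy rule is the usual "smallest cost per unit of expected utility gain" rule, and all function names are hypothetical. It also assumes $0 < p_j < 1$ for every $j$.

```python
STAR = None  # marks an untested variable in a partial assignment

def threshold_utility(a, theta, b):
    """Return (g(b), Q) for the formula sum(a[i] * x[i]) >= theta.

    Assumed construction (not taken verbatim from the paper): g1 measures
    progress toward a 1-certificate, g0 toward a 0-certificate, and the two
    are combined as Q1*Q0 - (Q1 - g1)*(Q0 - g0).
    """
    lo = sum(min(ai, 0) if v is STAR else ai * v for ai, v in zip(a, b))
    hi = sum(max(ai, 0) if v is STAR else ai * v for ai, v in zip(a, b))
    r_min = sum(min(ai, 0) for ai in a)
    r_max = sum(max(ai, 0) for ai in a)
    q1 = max(theta - r_min, 0)        # goal for certifying value 1
    q0 = max(r_max - theta + 1, 0)    # goal for certifying value 0
    g1 = min(q1, lo - r_min)
    g0 = min(q0, r_max - hi)
    return q1 * q0 - (q1 - g1) * (q0 - g0), q1 * q0

def combined_utility(formulas, b):
    """Sum the per-formula utilities and goal values (conjunctive combination)."""
    parts = [threshold_utility(a, theta, b) for a, theta in formulas]
    return sum(g for g, _ in parts), sum(q for _, q in parts)

def adaptive_greedy(formulas, p, c, x):
    """Test variables of the hidden input x greedily until the goal value is reached."""
    n = len(p)
    b, cost = [STAR] * n, 0
    g_now, Q = combined_utility(formulas, b)
    while g_now < Q:
        best, best_ratio = None, None
        for j in range(n):
            if b[j] is not STAR:
                continue
            gain = 0.0  # expected increase in utility from testing j
            for value, prob in ((1, p[j]), (0, 1 - p[j])):
                trial = list(b)
                trial[j] = value
                gain += prob * (combined_utility(formulas, trial)[0] - g_now)
            if gain > 0 and (best is None or c[j] / gain < best_ratio):
                best, best_ratio = j, c[j] / gain
        b[best] = x[best]
        cost += c[best]
        g_now = combined_utility(formulas, b)[0]
    return b, cost

def formula_values(formulas, b):
    """Read off each formula's value from a cover b (1 iff the minimum reaches theta)."""
    out = []
    for a, theta in formulas:
        lo = sum(min(ai, 0) if v is STAR else ai * v for ai, v in zip(a, b))
        out.append(1 if lo >= theta else 0)
    return out

# x0 OR x1 (threshold 1) and x1 AND x2 (threshold 2), hidden input x = (0, 1, 1).
formulas = [([1, 1, 0], 1), ([0, 1, 1], 2)]
cover, cost = adaptive_greedy(formulas, [0.5, 0.5, 0.5], [1, 1, 1], x=[0, 1, 1])
print(cover, cost, formula_values(formulas, cover))
```

On this instance the sketch tests the shared variable first, since it contributes utility to both formulas at once, which is the effect the combined utility function is meant to capture.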

Applied to OR formulas, the algorithm in the above proof is identical to the
previous $(\ln mt + 1)$ algorithm discussed above.

We modify the above algorithm by using Adaptive Dual Greedy in the algo- 
rithm, instead of Adaptive Greedy. This gives a different approximation bound. 

Theorem 8. There is a polynomial-time algorithm for solving the simultaneous
evaluation of threshold formulas problem, which produces a solution that is
within a factor of $D_{max}$ of optimal, where
$D_{max} = \max_{k \in \{1, \dots, m\}} \sum_{i=1}^{n} |a_{ki}|$.
Proof. By Theorem 01 the approximation factor achieved by Adaptive Dual
Greedy is $\max \frac{\sum_j g_{S,x}(j)}{Q - g(S,x)}$.

We bound this ratio for the $g$ constructed in the proof of Theorem 7. Let
$D_k = \sum_{i=1}^{n} |a_{ki}|$. Let $d \in \{0,1\}^n$ and $S \subseteq N$.
Without loss of generality,

assume 5 = {n' + 1, . . . , n}. In the fcth threshold formula, for i > n', replace Xj 
with dj . This induces a new threshold formula onn — n' variables with threshold 
dk.d = &k — Dk,d whose coefficients sum to Dk.b = Dk — Y^i= n ' a idi- Let b be 
the partial assignment such that 6, = di for i > n' , and 6,; = * otherwise. If b 
contains either a 0-certificate or a 1-certificate for then Qk — g k (S,d) = 0. 

Otherwise, Q k -g{S 7 d) = {0k,b)(Dk,b-0k,b+l), and £\ c/| >d (i) < D fe>6 max{0, 

M + 1}. It follows that ^ig!^ < D*,6 < -D 



Since this holds for each $k$, $\max \frac{\sum_j g_{S,x}(j)}{Q - g(S,x)} \le D_{max}$. □

For the special case of simultaneous evaluation of OR formulas, we thus have
a $\beta$-approximation algorithm, where $\beta$ is the length of the largest
OR formula. This improves the $2\beta$-approximation achieved by the
randomized algorithm of
Liu et al. pH] , 

We use a similar approach to solve the Linear Function Ranking problem.
In this problem, you are given a system of linear functions
$f_1, \dots, f_m$, where for $j \in \{1, \dots, m\}$, $f_j$ is
$a_{j1} x_1 + a_{j2} x_2 + \dots + a_{jn} x_n$, and the coefficients are
integers. You would like to determine the sorted order of the values
$f_1(x), \dots, f_m(x)$, for an initially unknown $x \in \{0,1\}^n$. (Note
that the values of the $f_j(x)$ are not Boolean.) We consider the problem of
finding an optimal testing strategy for this problem, where as usual,
$x \sim D_p$, for some probability vector $p$, and there is a cost vector $c$
specifying the cost of testing each variable $x_i$.

Note that there may be more than one correct output for this problem if
there are ties. So, strictly speaking, this is not a function evaluation
problem. Nevertheless, we can still exploit our previous techniques. For each
system of linear functions $f_1, \dots, f_m$ over $x_1, \dots, x_n$, and each
$x \in \{0,1\}^n$, let $f(x)$ denote the set of permutations
$\{f_{j_1}, f_{j_2}, \dots, f_{j_m}\}$ of $f_1, \dots, f_m$ such that
$f_{j_1}(x) \le f_{j_2}(x) \le \dots \le f_{j_m}(x)$. The goal of sorting the
$f_j$ is to output some permutation that we know definitively to be in $f(x)$.
Note that in particular, if e.g., $f_i(x) < f_j(x)$, it may be enough for us
to determine that $f_i(x) \le f_j(x)$.
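
As a small illustration of why ties lead to several correct outputs, the sketch below enumerates the set $f(x)$ by brute force for a fully known $x$. The function name `valid_rankings` and the example coefficients are hypothetical, and the enumeration over all permutations is only meant for tiny $m$.

```python
from itertools import permutations

def valid_rankings(coeffs, x):
    """All orderings (as tuples of function indices) with non-decreasing values on x.

    coeffs[j] holds the integer coefficients of the j-th linear function.
    """
    values = [sum(a * xi for a, xi in zip(row, x)) for row in coeffs]
    m = len(coeffs)
    return [perm for perm in permutations(range(m))
            if all(values[perm[t]] <= values[perm[t + 1]] for t in range(m - 1))]

# Three linear functions over two variables.
coeffs = [[1, 1], [2, 0], [0, 1]]
print(valid_rankings(coeffs, [1, 1]))  # functions 0 and 1 tie on this input
print(valid_rankings(coeffs, [1, 0]))  # no ties on this input
```

When two functions tie, either relative order is acceptable, so a testing strategy only needs to certify one member of this set.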

Theorem 9. There is an algorithm that solves the Linear Function Ranking
problem that runs in time polynomial in $m$, $n$, and $D_{max}$, and achieves
an approximation factor that is within $O(\log(m D_{max}))$ of optimal, where
$D_{max}$ is the maximum value of $\sum_{i=1}^{n} |a_{ji}|$ over all the
functions $f_j$.

Proof. For each pair of linear functions $f_i$ and $f_j$ in the system, where
$i < j$, let $f_{ij}$ denote the linear function $f_i - f_j$. We construct a
utility function $g^{(ij)}$ with goal value $Q^{(ij)}$. Intuitively, the goal
value of $g^{(ij)}$ is reached when there is enough information to determine
that $f_{ij}(x) \ge 0$, or when there is enough information to determine that
$f_{ij}(x) \le 0$.

The construction of $g^{(ij)}$ is very similar to the construction of the
utility function in our first threshold evaluation algorithm. For each $i,j$
pair, let $\min_{ij}(b)$ be the minimum value of $f_{ij}(b')$ on any
assignment $b' \in \{0,1\}^n$ such that $b' \sim b$, and let $\max_{ij}(b)$ be
the maximum value. Let $R_{max}(ij) = \max_{ij}(*, \dots, *)$ and let
$R_{min}(ij) = \min_{ij}(*, \dots, *)$.

Let $g^{(ij)}_{<} : \{0,1,*\}^n \to \mathbb{Z}_{\ge 0}$ be defined as follows.
If $R_{max}(ij) \le 0$, then $g^{(ij)}_{<}(b) = 0$ for all
$b \in \{0,1,*\}^n$ and $Q^{(ij)}_{<} = 0$. Otherwise, for
$b \in \{0,1,*\}^n$, let
$g^{(ij)}_{<}(b) = \min\{R_{max}(ij), R_{max}(ij) - \max_{ij}(b)\}$ and
$Q^{(ij)}_{<} = R_{max}(ij)$. It follows that for $b \in \{0,1,*\}^n$,
$f_i(b') \le f_j(b')$ for all extensions $b' \sim b$ iff
$g^{(ij)}_{<}(b) = Q^{(ij)}_{<}$.

We define $g^{(ij)}_{>}$ and $Q^{(ij)}_{>}$ symmetrically, so that
$f_i(b') \ge f_j(b')$ for all extensions $b' \sim b$ iff
$g^{(ij)}_{>}(b) = Q^{(ij)}_{>}$.

We apply the disjunctive construction of Lemma 1 to combine $g^{(ij)}_{<}$ and
$g^{(ij)}_{>}$ and their associated goal values. Let the resulting new utility
function be $g^{(ij)}$ and let its goal value be $Q^{(ij)}$. As in the analysis
of the algorithm in Section 5.3, we can show that $Q^{(ij)}$ is $O(D^2)$, where
$D$ is the sum of the magnitudes of the coefficients in $f_{ij}$.

Using the AND construction of Lemma 1 to combine the $g^{(ij)}$, we get our
final utility function $g = \sum_{i<j} g^{(ij)}$ with goal value
$Q = \sum_{i<j} Q^{(ij)}$.
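
The construction of the ranking utility can be sketched as follows. The "at most" part follows the definition given in the text; the symmetric "at least" part and the multiplicative form of the disjunctive combination, $Q_1 Q_2 - (Q_1 - g_1)(Q_2 - g_2)$, are our assumptions about what the symmetric definition and Lemma 1 look like, and all function names are hypothetical.

```python
from itertools import combinations

STAR = None  # marks an untested variable in a partial assignment

def extremes(coeffs, b):
    """Max and min of sum(coeffs[i] * x[i]) over all completions x of b."""
    hi = sum(max(a, 0) if v is STAR else a * v for a, v in zip(coeffs, b))
    lo = sum(min(a, 0) if v is STAR else a * v for a, v in zip(coeffs, b))
    return hi, lo

def pair_utility(fi, fj, b):
    """Return (g(b), Q) for the pair, where f_ij = f_i - f_j."""
    diff = [ai - aj for ai, aj in zip(fi, fj)]
    r_max, r_min = extremes(diff, [STAR] * len(diff))
    hi, lo = extremes(diff, b)
    # Progress toward knowing f_i <= f_j on every completion (max_ij(b) <= 0).
    q_le = max(r_max, 0)
    g_le = min(q_le, r_max - hi) if r_max > 0 else 0
    # Symmetric part (assumed): progress toward knowing f_i >= f_j.
    q_ge = max(-r_min, 0)
    g_ge = min(q_ge, lo - r_min) if r_min < 0 else 0
    # Assumed disjunctive combination: goal reached when either part is.
    q = q_le * q_ge
    return q - (q_le - g_le) * (q_ge - g_ge), q

def ranking_utility(fs, b):
    """Sum of the pair utilities over all i < j, with the summed goal value."""
    pairs = [pair_utility(fs[i], fs[j], b)
             for i, j in combinations(range(len(fs)), 2)]
    return sum(g for g, _ in pairs), sum(q for _, q in pairs)

# f_1 = x_1 + x_2 and f_2 = 2 x_3, so f_12 = x_1 + x_2 - 2 x_3.
fs = [[1, 1, 0], [0, 0, 2]]
for b in ([STAR, STAR, STAR], [1, STAR, STAR], [0, 0, STAR], [1, 1, STAR]):
    print(b, ranking_utility(fs, b))
```

In this example the goal value is reached for the partial assignments $(0, 0, *)$ and $(1, 1, *)$, which are exactly the ones shown for which the order of the two functions is already settled.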

We now show that achieving the goal utility $Q$ is equivalent to having enough
information to do the ranking. Until the goal value is reached, there is still
a pair $i, j$ such that it remains possible that $f_i(x) > f_j(x)$ (under one
setting of the untested variables), and it remains possible that
$f_i(x) < f_j(x)$ (under another setting). In this situation, we do not have
enough information to output a ranking we know to be valid.

Once $g(b) = Q$, the situation changes. For each $i,j$ such that
$f_i(x) < f_j(x)$, we know that $f_i(x) \le f_j(x)$. Similarly, if
$f_i(x) > f_j(x)$, then at goal utility $Q$, we know that
$f_i(x) \ge f_j(x)$. If $f_i(x) = f_j(x)$ at goal utility $Q$, we may only know
that $f_i(x) \ge f_j(x)$ or that $f_i(x) \le f_j(x)$. We build a valid ranking
from this knowledge as follows. If there exists an $i$ such that we know that
$f_i(x) \le f_j(x)$ for all $j \ne i$, then we place $f_i(x)$ first in our
ranking, and recursively rank the other elements. Otherwise, we can easily
find a "directed cycle," i.e. a sequence $i_1, \dots, i_m$, $m \ge 2$, such
that we know that $f_{i_1}(x) \le f_{i_2}(x) \le \dots \le f_{i_m}(x)$ and
$f_{i_m}(x) \le f_{i_1}(x)$. It follows that
$f_{i_1}(x) = \dots = f_{i_m}(x)$. In this case, we can delete
$f_{i_2}, \dots, f_{i_m}$, recursively rank $f_{i_1}$ and the remaining $f_i$,
and then insert $f_{i_2}, \dots, f_{i_m}$ into the ranking next to $f_{i_1}$.
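
The ranking step just described can be sketched as a short recursive procedure. The sketch assumes the knowledge is given as a predicate `known_le(i, j)` that is true when we know $f_i(x) \le f_j(x)$, that this knowledge is correct, and that for every pair at least one direction is known; the names `rank_from_knowledge` and `find_cycle` are hypothetical.

```python
def find_cycle(items, known_le):
    """Find a directed cycle of known relations among items.

    Assumes no item is known to be <= all others, so every item has some j
    with known_le(cur, j) false, and hence known_le(j, cur) true.
    """
    path, seen = [], {}
    cur = items[0]
    while cur not in seen:
        seen[cur] = len(path)
        path.append(cur)
        cur = next(j for j in items if j != cur and not known_le(cur, j))
    return path[seen[cur]:]  # all functions on the cycle have equal value

def rank_from_knowledge(items, known_le):
    """Return the items in an order that is valid given the known relations."""
    items = list(items)
    if len(items) <= 1:
        return items
    for i in items:
        if all(known_le(i, j) for j in items if j != i):
            rest = [j for j in items if j != i]
            return [i] + rank_from_knowledge(rest, known_le)
    # No known minimum: contract a cycle of equal-valued functions.
    cycle = find_cycle(items, known_le)
    keep, dropped = cycle[0], cycle[1:]
    rest = [j for j in items if j not in dropped]
    order = rank_from_knowledge(rest, known_le)
    pos = order.index(keep)
    return order[:pos + 1] + dropped + order[pos + 1:]

# Values 5, 5, 5, 1: the three tied functions are known only "around a cycle".
values = [5, 5, 5, 1]
known = {(0, 1), (1, 2), (2, 0), (3, 0), (3, 1), (3, 2)}
order = rank_from_knowledge(range(4), lambda i, j: (i, j) in known)
print(order)
```

Each recursive call removes at least one item, so the procedure terminates after at most as many calls as there are functions.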

Applying Adaptive Greedy to solve the SSSC problem for $g$, the theorem
follows from the $(\ln Q + 1)$ approximation bound for Adaptive Greedy, and the
fact that $Q = O(D_{max}^2 m^2)$. □
A Table of notation

Symbol            Meaning

$x_i$             the $i$th attribute variable
$p_i$             probability that variable $x_i$ is 1
$c_i$             cost of testing $x_i$
$p$               the probability product vector $\{p_1, p_2, \dots, p_n\}$
$c$               the cost vector $\{c_1, c_2, \dots, c_n\}$
$b$               a partial assignment, an element of $\{0,1,*\}^n$
$dom(b)$          $\{b_i \mid b_i = 1 \text{ or } b_i = 0\}$, the set of variables of $b$ that have already been tested
$a \sim b$        $a$ extends $b$ (is identical to $b$ for all variables $i$ such that $b_i \ne *$)
$D_p$             product probability distribution, defined by $p$
$x \sim D_p$      a random $x$ drawn from probability distribution $D_p$
$Q$               goal utility
$P$               maximum utility that testing a single variable $x_i$ can contribute
$g$               utility function defined on partial assignments, returns a positive integer that is at most $Q$
$N$               the set $\{1, \dots, n\}$
$S$               a subset of $N$
$g_S(j)$          the increase in utility produced by adding $j$ to $S$
$g(S,b)$          utility of testing only the items in $S$, with outcomes specified by $b$
$g_{S,b}(j)$      $g(S \cup \{j\}, b) - g(S, b)$
$g_b(i,l)$        the increase in utility when $b$ is extended by testing variable $i$ with outcome $l$
$k$               terms in a DNF
$d$               clauses in a CNF
$m$               the number of linear threshold formulas in the simultaneous evaluation problem
$min(b)$          the minimum possible value of a linear threshold function for any extension of $b$
$max(b)$          symmetric to $min(b)$, but maximum
$R_{min}$         $min(*, \dots, *)$
$R_{max}$         $max(*, \dots, *)$
$L_x$             the dual LP for SSSC for instance $x$
$y_S$             the variable in the Dual LP for SSC associated with subset $S$
$q_x$             objective function of $L_x$
$h_{j,x}(y)$      $\sum_{S \subseteq N} g_{S,x}(j) \, y_S$
$h'_j(y)$         $\sum_{S \subseteq N} (E[g_{S,x}(j)]) \, y_S$
$Y$               the assignment to the $y_S$ variables after running Adaptive Dual Greedy
$x^j$             the assignment of $x$ setting $x_j$ to $*$
$C(x)$            the sequence of tested items, in order of testing
$F(x)$            the set of subsets $F^l$ produced over the course of running Adaptive Dual Greedy
$Y^t$             the value of assignment $y$ at the end of the $t$th iteration
$b^t$             the value of $b$ on input $x$ after the $t$th iteration of the loop
$x'$              the assignment produced from $x$ by complementing the $j$th bit
$Y^x$             the assignment to the dual LP variables used in the analysis of the new bound for Adaptive Greedy






